clang warns:
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:18: error: result of comparison against a string
literal is unspecified (use an explicit string
comparison function instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/block-info.c:298:51: error: result of comparison against a string
literal is unspecified (use an explicit string comparison function
instead) [-Werror,-Wstring-compare]
if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
^ ~~~~~~~~~~~~~~~
util/map.c:434:15: error: result of comparison against a string literal
is unspecified (use an explicit string comparison function instead)
[-Werror,-Wstring-compare]
if (srcline != SRCLINE_UNKNOWN)
^ ~~~~~~~~~~~~~~~
Reviewer Notes:
Looks good to me. Some more context:
https://clang.llvm.org/docs/DiagnosticsReference.html#wstring-compare
The spec says:
J.1 Unspecified behavior
The following are unspecified:
.. Whether two string literals result in distinct arrays (6.4.5).
Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Keeping <john@metanate.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: clang-built-linux@googlegroups.com
Link: https://github.com/ClangBuiltLinux/linux/issues/900
Link: http://lore.kernel.org/lkml/20200223193456.25291-1-nick.desaulniers@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
`tools/perf/util/map.c` has a function named `maps__insert` that
acquires a write lock if its in multithread context.
Even though this lock is released when function successfully completes,
there's a branch that is executed when `maps_by_name == NULL` that
returns from this function without releasing the write lock.
Added an `up_write` to release the lock when this happens.
Fixes: a7c2b572e2 ("perf map_groups: Auto sort maps by name, if needed")
Signed-off-by: Cengiz Can <cengiz@kernel.wtf>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20200120141553.23934-1-cengiz@kernel.wtf
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And take it into account when looking up DSOs when we have the dso_id
fields obtained from somewhere, like from PERF_RECORD_MMAP2 records.
Instances of struct map pointing to the same DSO pathname but with
anything in dso_id different are in fact different DSOs, so better have
different 'struct dso' instances to reflect that. At some point we may
want to get copies of the contents of the different objects if we want
to do correct annotation or other analysis.
With this we get 'struct map' 24 bytes leaner:
$ pahole -C map ~/bin/perf
struct map {
union {
struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 */
struct list_head node; /* 0 16 */
} __attribute__((__aligned__(8))); /* 0 24 */
u64 start; /* 24 8 */
u64 end; /* 32 8 */
_Bool erange_warned:1; /* 40: 0 1 */
_Bool priv:1; /* 40: 1 1 */
/* XXX 6 bits hole, try to pack */
/* XXX 3 bytes hole, try to pack */
u32 prot; /* 44 4 */
u64 pgoff; /* 48 8 */
u64 reloc; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u64 (*map_ip)(struct map *, u64); /* 64 8 */
u64 (*unmap_ip)(struct map *, u64); /* 72 8 */
struct dso * dso; /* 80 8 */
refcount_t refcnt; /* 88 4 */
u32 flags; /* 92 4 */
/* size: 96, cachelines: 2, members: 13 */
/* sum members: 92, holes: 1, sum holes: 3 */
/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
/* forced alignments: 1 */
/* last cacheline: 32 bytes */
} __attribute__((__aligned__(8)));
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-g4hxxmraplo7wfjmk384mfsb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And this patch highlights where these fields are being used: in the sort
order where it uses it to compare maps and classify samples taking into
account not just the DSO, but those DSO id fields.
I think these should be used to differentiate DSOs with the same name
but different 'struct dso_id' fields, i.e. these fields should move to
'struct dso' and then be used as part of the key when doing lookups for
DSOs, in addition to the DSO name.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8v5isitqy0dup47nnwkpc80f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are still lots of lookups by name, even if just when loading
vmlinux, till that code is studied to figure out if its possible to do
away with those map lookup by names, provide a way to sort it using
libc's qsort/bsearch.
Doing it at the first lookup defers the sorting a bit, and as the code
stands now, is never done for user maps, just for the kernel ones.
# perf probe -l
# perf probe -x ~/bin/perf -L __map_groups__find_by_name
<__map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 static struct map *__map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
struct map **mapp;
4 if (mg->maps_by_name == NULL &&
5 map__groups__sort_by_name_from_rbtree(mg))
6 return NULL;
8 mapp = bsearch(name, mg->maps_by_name, mg->nr_maps, sizeof(*mapp), map__strcmp_name);
9 if (mapp)
10 return *mapp;
11 return NULL;
12 }
struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
{
# perf probe -x ~/bin/perf 'found=__map_groups__find_by_name:10 name:string'
Added new event:
probe_perf:found (on __map_groups__find_by_name:10 in /home/acme/bin/perf with name:string)
You can now use it in all perf tools, such as:
perf record -e probe_perf:found -aR sleep 1
#
# perf probe -x ~/bin/perf -L map_groups__find_by_name
<map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
2 struct maps *maps = &mg->maps;
struct map *map;
5 down_read(&maps->lock);
7 if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
8 map = mg->last_search_by_name;
9 goto out_unlock;
}
/*
* If we have mg->maps_by_name, then the name isn't in the rbtree,
* as mg->maps_by_name mirrors the rbtree when lookups by name are
* made.
*/
16 map = __map_groups__find_by_name(mg, name);
17 if (map || mg->maps_by_name != NULL)
18 goto out_unlock;
/* Fallback to traversing the rbtree... */
21 maps__for_each_entry(maps, map)
22 if (strcmp(map->dso->short_name, name) == 0) {
23 mg->last_search_by_name = map;
24 goto out_unlock;
}
27 map = NULL;
out_unlock:
30 up_read(&maps->lock);
31 return map;
32 }
int dso__load_vmlinux(struct dso *dso, struct map *map,
const char *vmlinux, bool vmlinux_allocated)
# perf probe -x ~/bin/perf 'fallback=map_groups__find_by_name:21 name:string'
Added new events:
probe_perf:fallback (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
probe_perf:fallback_1 (on map_groups__find_by_name:21 in /home/acme/bin/perf with name:string)
You can now use it in all perf tools, such as:
perf record -e probe_perf:fallback_1 -aR sleep 1
#
# perf probe -l
probe_perf:fallback (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
probe_perf:fallback_1 (on map_groups__find_by_name:21@util/symbol.c in /home/acme/bin/perf with name_string)
probe_perf:found (on __map_groups__find_by_name:10@util/symbol.c in /home/acme/bin/perf with name_string)
#
# perf stat -e probe_perf:*
Now run 'perf top' in another term and then, after a while, stop 'perf stat':
Furthermore, if we ask for interval printing, we can see that that is done just
at the start of the workload:
# perf stat -I1000 -e probe_perf:*
# time counts unit events
1.000319513 0 probe_perf:found
1.000319513 0 probe_perf:fallback_1
1.000319513 0 probe_perf:fallback
2.001868092 23,251 probe_perf:found
2.001868092 0 probe_perf:fallback_1
2.001868092 0 probe_perf:fallback
3.002901597 0 probe_perf:found
3.002901597 0 probe_perf:fallback_1
3.002901597 0 probe_perf:fallback
4.003358591 0 probe_perf:found
4.003358591 0 probe_perf:fallback_1
4.003358591 0 probe_perf:fallback
^C
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c5lmbyr14x448rcfii7y6t3k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Lets see if it helps:
First look at the probeable lines for the function that does lookups by
name in a map_groups struct:
# perf probe -x ~/bin/perf -L map_groups__find_by_name
<map_groups__find_by_name@/home/acme/git/perf/tools/perf/util/symbol.c:0>
0 struct map *map_groups__find_by_name(struct map_groups *mg, const char *name)
1 {
2 struct maps *maps = &mg->maps;
struct map *map;
5 down_read(&maps->lock);
7 if (mg->last_search_by_name && strcmp(mg->last_search_by_name->dso->short_name, name) == 0) {
8 map = mg->last_search_by_name;
9 goto out_unlock;
}
12 maps__for_each_entry(maps, map)
13 if (strcmp(map->dso->short_name, name) == 0) {
14 mg->last_search_by_name = map;
15 goto out_unlock;
}
18 map = NULL;
out_unlock:
21 up_read(&maps->lock);
22 return map;
23 }
int dso__load_vmlinux(struct dso *dso, struct map *map,
const char *vmlinux, bool vmlinux_allocated)
#
Now add a probe to the place where we reuse the last search:
# perf probe -x ~/bin/perf map_groups__find_by_name:8
Added new event:
probe_perf:map_groups__find_by_name (on map_groups__find_by_name:8 in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:map_groups__find_by_name -aR sleep 1
#
Now lets do a system wide 'perf stat' counting those events:
# perf stat -e probe_perf:*
Leave it running and lets do a 'perf top', then, after a while, stop the
'perf stat':
# perf stat -e probe_perf:*
^C
Performance counter stats for 'system wide':
3,603 probe_perf:map_groups__find_by_name
44.565253139 seconds time elapsed
#
yeah, good to have.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tcz37g3nxv3tvxw3q90vga3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we pass that substructure around and with it consolidate lots of
functions that receive a (map, symbol) pair and now can receive just a
'struct map_symbol' pointer.
This further paves the way to add 'struct map_groups' to 'struct
map_symbol' so that we can have all we need for annotation so that we
can ditch 'struct map'->groups, i.e. have the map_groups pointer in a
more central place, avoiding the pointer in the 'struct map' that have
tons of instances.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-fs90ttd9q12l7989fo7pw81q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Whenever an mmap/mmap2 event occurs, the map tree must be updated to add a new
entry. If a new map overlaps a previous map, the overlapped section of the
previous map is effectively unmapped, but the non-overlapping sections are
still valid.
maps__fixup_overlappings() is responsible for creating any new map entries from
the previously overlapped map. It optionally creates a before and an after map.
When creating the after map the existing code failed to adjust the map.pgoff.
This meant the new after map would incorrectly calculate the file offset
for the ip. This results in incorrect symbol name resolution for any ip in the
after region.
Make maps__fixup_overlappings() correctly populate map.pgoff.
Add an assert that new mapping matches old mapping at the beginning of
the after map.
Committer-testing:
Validated correct parsing of libcoreclr.so symbols from .NET Core 3.0 preview9
(which didn't strip symbols).
Preparation:
~/dotnet3.0-preview9/dotnet new webapi -o perfSymbol
cd perfSymbol
~/dotnet3.0-preview9/dotnet publish
perf record ~/dotnet3.0-preview9/dotnet \
bin/Debug/netcoreapp3.0/publish/perfSymbol.dll
^C
Before:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
grep libcoreclr.so | head -n 4
dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.705249: 250000 cpu-clock: \
7fe6159a1f99 [unknown] \
(.../3.0.0-preview9-19423-09/libcoreclr.so)
After:
perf script --show-mmap-events 2>&1 | grep -e MMAP -e unknown |\
grep libcoreclr.so | head -n 4
dotnet 1907 373352.698780: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615726000(0x768000) @ 0 08:02 5510620 765057155]: \
r-xp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701091: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615974000(0x1000) @ 0x24e000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
dotnet 1907 373352.701241: PERF_RECORD_MMAP2 1907/1907: \
[0x7fe615c42000(0x1000) @ 0x51c000 08:02 5510620 765057155]: \
rwxp .../3.0.0-preview9-19423-09/libcoreclr.so
All the [unknown] symbols were resolved.
Signed-off-by: Steve MacLean <Steve.MacLean@Microsoft.com>
Tested-by: Brian Robbins <brianrob@microsoft.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: John Keeping <john@metanate.com>
Cc: John Salem <josalem@microsoft.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom McDonald <thomas.mcdonald@microsoft.com>
Link: http://lore.kernel.org/lkml/BN8PR21MB136270949F22A6A02335C238F7800@BN8PR21MB1362.namprd21.prod.outlook.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit e5adfc3e7e ("perf map: Synthesize maps only for thread group
leader") changed the recording side so that we no longer get mmap events
for threads other than the thread group leader (when synthesising these
events for threads which exist before perf is started).
When a file recorded after this change is loaded, the lack of mmap
records mean that unwinding is not set up for any other threads.
This can be seen in a simple record/report scenario:
perf record --call-graph=dwarf -t $TID
perf report
If $TID is a process ID then the report will show call graphs, but if
$TID is a secondary thread the output is as if --call-graph=none was
specified.
Following the rationale in that commit, move the libunwind fields into
struct map_groups and update the libunwind functions to take this
instead of the struct thread. This is only required for
unwind__finish_access which must now be called from map_groups__delete
and the others are changed for symmetry.
Note that unwind__get_entries keeps the thread argument since it is
required for symbol lookup and the libdw unwind provider uses the thread
ID.
Signed-off-by: John Keeping <john@metanate.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e5adfc3e7e ("perf map: Synthesize maps only for thread group leader")
Link: http://lkml.kernel.org/r/20190815100146.28842-2-john@metanate.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By calling maps__insert() we assume to get 2 references on the map,
which we relese within maps__remove call.
However if there's already same map name, we currently don't bump the
reference and can crash, like:
Program received signal SIGABRT, Aborted.
0x00007ffff75e60f5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6
#1 0x00007ffff75d0895 in abort () from /lib64/libc.so.6
#2 0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6
#3 0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131
#5 refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148
#6 map__put (map=0x1224df0) at util/map.c:299
#7 0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953
#8 maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959
#9 0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65
#10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728
#11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741
#12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362
#13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243
#14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322
#15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8,
...
Add the map to the list and getting the reference event if we find the
map with same name.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Fixes: 1e6285699b ("perf symbols: Fix slowness due to -ffunction-section")
Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When looking at PT or brstackinsn traces with 'perf script' it can be
very useful to see the source code. This adds a simple facility to print
them with 'perf script', if the information is available through dwarf
% perf record ...
% perf script -F insn,ip,sym,srccode
...
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004c6 main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004cd main
5 for (i = 0; i < 10000000; i++)
4004b3 main
6 v++;
% perf record -b ...
% perf script -F insn,ip,sym,srccode,brstackinsn
...
main+22:
0000000000400543 insn: e8 ca ff ff ff # PRED
|18 f1();
f1:
0000000000400512 insn: 55
|10 {
0000000000400513 insn: 48 89 e5
0000000000400516 insn: b8 00 00 00 00
|11 f2();
000000000040051b insn: e8 d6 ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
0000000000400500 insn: 8b 0d 2a 0b 20 00
0000000000400506 insn: 99
0000000000400507 insn: f7 f9
0000000000400509 insn: 89 05 29 0b 20 00
000000000040050f insn: 90
|7 }
0000000000400510 insn: 5d
0000000000400511 insn: c3 # PRED
f1+14:
0000000000400520 insn: b8 00 00 00 00
|12 f2();
0000000000400525 insn: e8 cc ff ff ff # PRED
f2:
00000000004004f6 insn: 55
|5 {
00000000004004f7 insn: 48 89 e5
00000000004004fa insn: 8b 05 2c 0b 20 00
|6 c = a / b;
Not supported for callchains currently, would need some layout changes
there.
Committer notes:
Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
warning:
In file included from util/srccode.c:19:0:
/usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
#warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf can take minutes to parse an image when -ffunction-section is used.
This is especially true with the kernel image when it is compiled this
way, which is the arm64 default since the patcheset "Enable deadcode
elimination at link time".
Perf organize maps using a rbtree. Whenever perf finds a new symbols, it
first searches this rbtree for the map it belongs to, by strcmp()'aring
section names. When it finds the map with the right name, it uses it to
add the symbol. With a usual image there aren't so many maps but when
using -ffunction-section there's basically one map per function. With
the kernel image that's north of 40,000 maps. For most symbols perf has
to parses the entire rbtree to eventually create a new map and add it.
Consequently perf spends most of the time browsing a rbtree that keeps
getting larger.
This performance fix introduces a secondary rbtree that indexes maps
based on the section name.
Signed-off-by: Eric Saint-Etienne <eric.saint.etienne@oracle.com>
Reviewed-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: David Aldridge <david.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1542822679-25591-1-git-send-email-eric.saint.etienne@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 1c5aae7710 ("perf machine: Create maps for x86 PTI entry
trampolines") revealed a problem with maps__find_symbol_by_name() that
resulted in probes not being found e.g.
$ sudo perf probe xsk_mmap
xsk_mmap is out of .text, skip it.
Probe point 'xsk_mmap' not found.
Error: Failed to add events.
maps__find_symbol_by_name() can optionally return the map of the found
symbol. It can get the map wrong because, in fact, the symbol is found
on the map's dso, not allowing for the possibility that the dso has more
than one map. Fix by always checking the map contains the symbol.
Reported-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1c5aae7710 ("perf machine: Create maps for x86 PTI entry trampolines")
Link: http://lkml.kernel.org/r/20180907085116.25782-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>