Pull perf/core improvements and fixes from Jiri Olsa:
* Wire up perf_regs and unwind support for ARM64 (Jean Pihet)
* Move u64_swap union to its single user's header, evsel.h (Borislav Petkov)
* Fix for s390 to properly parse tracepoints plus test code (Alexander Yarygin)
* Handle EINTR error for readn/writen (Namhyung Kim)
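The EINTR handling mentioned in the last item can be sketched roughly as
below. This is a minimal illustration, not the code from the patch: the
name readn_retry() is hypothetical, and it assumes a plain blocking file
descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical sketch: read up to n bytes, retrying when read() is
 * interrupted by a signal. Returns the number of bytes read (short
 * only on EOF) or a negative value on a real error.
 */
ssize_t readn_retry(int fd, void *buf, size_t n)
{
	char *p = buf;
	size_t left = n;

	while (left > 0) {
		ssize_t ret = read(fd, p, left);

		if (ret < 0 && errno == EINTR)
			continue;	/* interrupted by a signal: retry */
		if (ret < 0)
			return ret;	/* real error */
		if (ret == 0)
			break;		/* EOF */

		left -= (size_t)ret;
		p += ret;
	}

	return (ssize_t)(n - left);
}
```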
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Modules installed outside of the kernel's build system should go into
"%s/lib/modules/%s/extra", but at present, perf will only look at them
when they are in "%s/lib/modules/%s/kernel". Let's encourage good
citizenship by relaxing this requirement to "%s/lib/modules/%s". This
way open source modules that are out-of-tree have no incentive to start
populating a directory reserved for in-kernel modules and I can stop
hex-editing my system's perf binary when profiling OSS out-of-tree
modules.
Feedback from Namhyung Kim correctly revealed that the hex-edits that I
had been doing meant that perf was also traversing the build and source
symlinks in %s/lib/modules/%s. That is undesirable, so we explicitly
exclude them from traversal with a minor tweak to the traversal routine.
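The exclusion can be pictured with a small predicate like the one below.
It is only a sketch: skip_toplevel_entry() and its depth argument are
hypothetical and are not taken from the actual traversal routine.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: returns true for directory entries that the
 * module walk should not descend into. depth == 0 is assumed to mean
 * "directly under %s/lib/modules/%s".
 */
bool skip_toplevel_entry(const char *name, int depth)
{
	if (!strcmp(name, ".") || !strcmp(name, ".."))
		return true;

	/* "build" and "source" are symlinks into the kernel tree. */
	if (depth == 0 &&
	    (!strcmp(name, "build") || !strcmp(name, "source")))
		return true;

	return false;
}
```

With this shape, "kernel" and "extra" are both traversed, while the two
symlinks are skipped only at the top level.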
Signed-off-by: Richard Yao <ryao@gentoo.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398532675-13684-1-git-send-email-ryao@gentoo.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Trace events potentially can have a '-' in their trace system name,
e.g. kvm on s390 defines kvm-s390:* tracepoints.
We could not parse them, because there was no rule for this:
$ sudo ./perf top -e "kvm-s390:*"
invalid or unsupported event: 'kvm-s390:*'
This patch adds an extra rule to event_legacy_tracepoint which handles
those cases. Without the patch, perf will not accept such tracepoints in
the -e option.
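To illustrate what is being accepted, here is a simplified hand-written
check. It is not the grammar rule added by the patch; the function name
and the exact character set are assumptions made for the example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Characters assumed valid in a system or event name (sketch only). */
static bool name_char_ok(char c)
{
	return isalnum((unsigned char)c) || c == '_' || c == '-';
}

/*
 * Hypothetical check for a "sys:event" spec where the system name
 * may contain '-', and the event part may contain glob characters.
 */
bool tracepoint_spec_ok(const char *spec)
{
	const char *colon = strchr(spec, ':');
	const char *p;

	if (!colon || colon == spec || colon[1] == '\0')
		return false;

	for (p = spec; p < colon; p++)
		if (!name_char_ok(*p))
			return false;

	for (p = colon + 1; *p; p++)
		if (!name_char_ok(*p) && *p != '*' && *p != '?')
			return false;

	return true;
}
```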
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/1398440047-6641-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Pull perf/core improvements and fixes from Jiri Olsa:
* Factor hists statistics counts processing which in turn also
fixes several bugs in TUI report command (Namhyung Kim)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
hist_browser__reset() is only called right after a filter is applied,
so it needs to update browser->nr_entries properly. We cannot use
hists->nr_non_filtered_entries directly since it's possible that such
entries are also filtered out by the minimum percentage limit.
In addition, when a filter is used for perf top, the hist browser's
nr_entries field was not updated after applying the filter. But it
needs to be updated as new samples are coming.
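The counting described above can be sketched as follows. The struct and
function names are hypothetical stand-ins, not the hist browser's real
types; the point is only that an entry must pass both the filter and the
minimum percentage limit to be counted.

```c
/* Hypothetical, simplified stand-in for a hist entry. */
struct entry_sketch {
	int filtered;	/* non-zero if some filter hides this entry */
	double percent;	/* overhead of this entry, in percent */
};

/*
 * Count entries that would actually be shown: not filtered out and
 * not below the minimum percentage limit.
 */
unsigned int count_visible(const struct entry_sketch *entries,
			   unsigned int nr, double min_pcnt)
{
	unsigned int i, visible = 0;

	for (i = 0; i < nr; i++) {
		if (entries[i].filtered)
			continue;
		if (entries[i].percent < min_pcnt)
			continue;
		visible++;
	}

	return visible;
}
```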
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, and again when adding them to the
output tree. It's not only confusing but can also cause a subtle
problem, since concurrent processing like in perf top might see the
updated stats before the entries are added to the output tree - for
example, seeing extra (blank) lines at the end and/or slightly
inaccurate percentages.
To fix this, only account the entries when they are moved into the
output tree so that they cannot be seen prematurely. There are some
exceptional cases here and there - they should be addressed separately
with comments.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
This patch figures out the max number of cpus and nodes that are on the
system and creates a map of cpu to node. This allows us to provide a cpu
and quickly get the node associated with it.
It was mostly copied from builtin-kmem.c and tweaked slightly to use less memory
(use possible cpus instead of max). It also calculates the max number of nodes.
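A rough sketch of how such a map could be filled is shown below. It
assumes the CPUs of each node are available as a list string such as
"0-3,8-11"; that input format and the name assign_cpulist() are
assumptions for the example, not a description of what the patch reads.

```c
#include <stdlib.h>

/*
 * Hypothetical sketch: mark every CPU named in cpulist ("0-3,8")
 * as belonging to node. cpunode_map has nr_cpus slots.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
int assign_cpulist(const char *cpulist, int node,
		   int *cpunode_map, int nr_cpus)
{
	const char *p = cpulist;

	while (*p && *p != '\n') {
		char *end;
		long first = strtol(p, &end, 10);
		long last = first;
		long cpu;

		if (end == p)
			return -1;	/* no digits where a CPU was expected */

		if (*end == '-') {
			p = end + 1;
			last = strtol(p, &end, 10);
			if (end == p)
				return -1;
		}

		if (first < 0 || last < first || last >= nr_cpus)
			return -1;

		for (cpu = first; cpu <= last; cpu++)
			cpunode_map[cpu] = node;

		p = (*end == ',') ? end + 1 : end;
	}

	return 0;
}
```

Once the map is filled, looking up the node of a CPU is a single array
index, cpunode_map[cpu].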
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1396896924-129847-2-git-send-email-dzickus@redhat.com
[ Removing out label code in init_cpunode_map ]
[ Adding check for snprintf error ]
[ Removing unneeded returns ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
In the current version, when using perf record, if something goes
wrong in tools/perf/builtin-record.c:375
session = perf_session__new(file, false, NULL);
The error message:
"Not enough memory for reading per file header"
is issued. This error message seems to be outdated and is not very
helpful. This patch proposes to replace this error message by
"Perf session creation failed"
I believe this issue has been brought to lkml:
https://lkml.org/lkml/2014/2/24/458
although this patch only tackles a (small) part of the issue.
Additionally, this patch improves error reporting in
tools/perf/util/data.c open_file_write.
Currently, if the call to open fails, the user is unaware of it.
This patch logs the error, before returning the error code to
the caller.
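The second part can be sketched as below. This is an illustration only:
the function name, the open flags and the use of fprintf() are
assumptions, not the code in tools/perf/util/data.c.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical sketch: open a file for writing and log the reason
 * when open() fails, before handing the error back to the caller.
 */
int open_file_write_sketch(const char *path)
{
	int fd;

	fd = open(path, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR);
	if (fd < 0)
		fprintf(stderr, "failed to open %s: %s\n",
			path, strerror(errno));

	return fd;
}
```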
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
[ Reorganize the changelog into paragraphs ]
[ Added empty line after fd declaration in open_file_write ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
perf report doesn't resolve function names in the VDSO:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
0x7fff6b1fe861
__gettimeofday
ACE_OS::gettimeofday()
...
In this case symbol values should be adjusted the same way as for executables,
relocatable objects and prelinked libraries.
After fix:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
__vdso_gettimeofday
__gettimeofday
ACE_OS::gettimeofday()
Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Add a hist.percentage option for setting the default value of
symbol_conf.filter_relative. It affects the output of various perf
commands (like perf report, top and diff) only if filter(s) are applied.
A user can write a .perfconfig file like below to show absolute
percentage of filtered entries by default:
$ cat ~/.perfconfig
[hist]
percentage = absolute
And it can be changed through command line:
$ perf report --percentage relative
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option controls the overhead percentage displayed.
It accepts only "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.
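A shared callback of this kind might look roughly like the sketch below.
The function name and signature are hypothetical; only the two accepted
values and the filter_relative flag come from the description.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical sketch of a --percentage argument parser.
 * Sets *filter_relative and returns 0, or returns -1 for any
 * value other than "relative" or "absolute".
 */
int parse_percentage_mode(const char *arg, bool *filter_relative)
{
	if (!strcmp(arg, "relative"))
		*filter_relative = true;
	else if (!strcmp(arg, "absolute"))
		*filter_relative = false;
	else
		return -1;

	return 0;
}
```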
For more information, please see the previous commit, which did the
same thing for "perf report".
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option controls the overhead percentage displayed.
It accepts only "relative" or "absolute".
"relative" means it's relative to filtered entries only so that the
sum of shown entries will always be 100%. "absolute" means it retains
the original value before and after the filter is applied.
$ perf report -s comm
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
6.11% as
4.35% sh
4.14% make
1.13% fixdep
...
$ perf report -s comm -c cc1,gcc --percentage absolute
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
$ perf report -s comm -c cc1,gcc --percentage relative
# Overhead Command
# ........ ............
#
90.69% cc1
9.31% gcc
Note that it has zero effect if no filter was applied.
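The relative numbers above are each entry's share of the filtered total.
A small sketch of that arithmetic follows; relative_percent() is a
hypothetical helper. Fed with the rounded absolute values shown (74.19
and 7.61), it gives roughly 90.70 and 9.30, close to the 90.69 and 9.31
printed by perf, which presumably works from the unrounded periods.

```c
/*
 * Hypothetical helper: rescale an absolute percentage so that it is
 * relative to the sum of the entries that survived the filter.
 */
double relative_percent(double abs_pcnt, double filtered_total_pcnt)
{
	if (filtered_total_pcnt <= 0.0)
		return 0.0;

	return 100.0 * abs_pcnt / filtered_total_pcnt;
}
```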
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
When filtering by thread, dso or symbol in the TUI, the total period is
also updated, so the output differs from the unfiltered one - the
percentages become relative to the filtered entries only. Sometimes
this is not desired, since users might expect the same results with a
filter applied.
So add new filtered_* fields to hists->stats to count them separately.
They'll be controlled/used by the user later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
As Namhyung reported (https://lkml.org/lkml/2014/4/1/89), the
current perf-probe -L option doesn't handle errors in line-range
searching correctly. It causes a SEGV if an error occurs during the
line-range search.
----
$ perf probe -x ./perf -v -L map__load
Open Debuginfo file: /home/namhyung/project/linux/tools/perf/perf
fname: util/map.c, lineno:153
New line range: 153 to 2147483647
path: (null)
Segmentation fault (core dumped)
----
This is because line_range_inline_cb() ignores errors
from find_line_range_by_line(), even though lr->path has
already been freed on the error path in find_line_range_by_line().
As a result, get_real_path() accesses lr->path and
dereferences a NULL pointer.
This fixes line_range_inline_cb() to handle the error correctly
and report it to the caller.
Note that this only fixes a possible SEGV; Namhyung's patch
is also required.
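The pattern of the fix, reduced to a self-contained sketch, is shown
below. The types and function names are hypothetical stand-ins for the
probe-finder code; the only point is that the caller checks the return
value before touching the path again.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the line range being searched. */
struct range_sketch {
	char *path;
	int found;
};

/* On error, frees the path and reports failure to the caller. */
static int find_range_sketch(struct range_sketch *lr, int simulate_error)
{
	if (simulate_error) {
		free(lr->path);
		lr->path = NULL;
		return -ENOENT;
	}

	lr->found = 1;
	return 0;
}

/*
 * Propagate the error instead of ignoring it, so that lr->path is
 * never used after the error path has released it.
 */
int walk_ranges_sketch(struct range_sketch *lr, int simulate_error,
		       size_t *path_len)
{
	int ret = find_range_sketch(lr, simulate_error);

	if (ret < 0)
		return ret;	/* lr->path is NULL here: do not touch it */

	*path_len = strlen(lr->path);
	return 0;
}
```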
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20140402054831.19080.27006.stgit@ltc230.yrl.intra.hitachi.co.jp
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Instead of bailing out as soon as we find a filter that applies, go on
checking all of them so that we can zoom in/out filters.
We also need to make sure we only update al->filtered after
thread__find_addr_map(), because that is where al->filtered gets
initialized to zero.
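Checking every filter instead of stopping at the first match can be
sketched with a bitmask, one bit per filter. The enum values, structs and
function below are hypothetical and only illustrate the idea; they are
not perf's actual definitions.

```c
#include <string.h>

/* Hypothetical bit positions, one per filter type. */
enum filter_bit_sketch {
	FILTER_BIT_DSO = 0,
	FILTER_BIT_THREAD,
	FILTER_BIT_SYMBOL,
};

struct sample_sketch {
	const char *dso;
	int tid;
	const char *sym;
};

/* A NULL string or a tid of -1 means that filter is inactive. */
struct filters_sketch {
	const char *dso;
	int tid;
	const char *sym;
};

/*
 * Evaluate all filters and record each one that hides the sample,
 * rather than returning at the first hit. A UI can later clear a
 * single bit to "zoom out" of just that filter.
 */
unsigned int compute_filtered(const struct sample_sketch *s,
			      const struct filters_sketch *f)
{
	unsigned int filtered = 0;	/* starts at zero, then bits are added */

	if (f->dso && strcmp(s->dso, f->dso))
		filtered |= 1u << FILTER_BIT_DSO;
	if (f->tid != -1 && s->tid != f->tid)
		filtered |= 1u << FILTER_BIT_THREAD;
	if (f->sym && strcmp(s->sym, f->sym))
		filtered |= 1u << FILTER_BIT_SYMBOL;

	return filtered;
}
```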
This will increase the cost of processing when we don't need this
toggling, but will provide flexibility for the TUI and GTK+ interfaces,
which will then need to create the hist_entries just once.
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-fhv9lhzdjxgp9w3w3668lsfw@git.kernel.org
[ yanked this out of a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When trying to capture perf data on a system running SPECjbb2013, perf
hung for about 15 minutes. This is because it took that long to gather
about 10,000 thread maps and process them.
I don't think a user wants to wait that long.
Instead, recognize that thread maps are roughly equivalent to pid maps
and just quickly copy those instead.
To do this, I synthesize 'fork' events, which eventually call
thread__fork() and copy the maps over.
The overhead goes from 15 minutes down to a few seconds.
--
V2: based on Jiri's comments, moved malloc up a level
and made sure the memory was freed
Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Link: http://lkml.kernel.org/r/1394808224-113774-1-git-send-email-dzickus@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>