arch/x86/tests/regs_load.S is missing the linker note about the stack
requirements, therefore making the linker fall back to an executable
stack. As this object gets linked against the final perf binary, it'll
needlessly end up with an executable stack. Fix this by adding the
appropriate linker note.
Also add a global linker flag to prevent future regressions, as
suggested by Jiri. This way perf won't get an executable stack even if
we fail to add the .GNU-stack linker note to future assembler files.
Though, doing so might create regressions the other way around, when
(statically) linking against libraries needing an executable stack.
But, apparently, regressing in that direction is wanted as it is an
indicator of poor code quality -- or just missing linker notes.
Fixes: 3c8b06f981 ("perf tests x86: Introduce perf_regs_load function")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1398617466-22749-1-git-send-email-minipli@googlemail.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Modules installed outside of the kernel's build system should go into
"%s/lib/modules/%s/extra", but at present, perf will only look at them
when they are in "%s/lib/modules/%s/kernel". Lets encourage good
citizenship by relaxing this requirement to "%s/lib/modules/%s". This
way open source modules that are out-of-tree have no incentive to start
populating a directory reserved for in-kernel modules and I can stop
hex-editing my system's perf binary when profiling OSS out-of-tree
modules.
Feedback from Namhyung Kim correctly revealed that the hex-edits that I
had been doing meant that perf was also traversing the build and source
symlinks in %s/lib/modules/%s. That is undesireable, so we explicitly
exclude them from traversal with a minor tweak to the traversal routine.
Signed-off-by: Richard Yao <ryao@gentoo.org>
Acked-by: Namhyung kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398532675-13684-1-git-send-email-ryao@gentoo.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
In tests/parse-events.c test cases are declared in evlist_test[]
arrays. Elements of arrays are initialized in following pattern:
[i] = {
.name = ...,
.check = ...,
},
When perf-test is running with '-v' option, 'i' variable will be
printed for every existing test.
However, we can't add any arch specific tests inside #ifdefs, because it
will create collision between the element number inside #ifdef and the
next one outside.
This patch adds 'id' field in evlist_test, uses it as a test
identifier and removes explicit numbering of array elements. This helps
to number tests with gaps.
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1398440047-6641-3-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Trace events potentially can have a '-' in their trace system name,
e.g. kvm on s390 defines kvm-s390:* tracepoints.
We could not parse them, because there was no rule for this:
$ sudo ./perf top -e "kvm-s390:*"
invalid or unsupported event: 'kvm-s390:*'
This patch adds an extra rule to event_legacy_tracepoint which handles
those cases. Without the patch, perf will not accept such tracepoints in
the -e option.
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: http://lkml.kernel.org/r/1398440047-6641-2-git-send-email-yarygin@linux.vnet.ibm.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Pull perf/core improvements and fixes from Jiri Olsa:
* Factor hists statistics counts processing which in turn also
fixes several bugs in TUI report command (Namhyung Kim)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The hist_browser__reset() is only called right after a filter is
applied so it needs to udpate browser->nr_entries properly. We cannot
use hists->nr_non_filtered_entreis directly since it's possible that
such entries are also filtered out by minimum percentage limit.
In addition when a filter is used for perf top, hist browser's
nr_entries field was not updated after applying the filter. But it
needs to be updated as new samples are coming.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The nr_entries variable is increased inside the loop in the function
but it always count the first entry regardless of it's filtered or
not; caused an off-by-one error.
It'd become a problem especially there's no entry at all - it'd get a
segfault during referencing a NULL pointer.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, other when adding them to the
output tree. It's not only confusing but also can cause a subtle
problem since concurrent processing like in perf top might see the
updated stats before adding entries into the output tree - like seeing
more (blank) lines at the end and/or slight inaccurate percentage.
To fix this, only account the entries when it's moved into the output
tree so that they cannot be seen prematurely. There're some
exceptional cases here and there - they should be addressed separately
with comments.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The hists->nr_entries is counted in multiple places so that they can
confuse readers of the code. This is a preparation of later change
and do not intend any functional difference.
Note that report__collapse_hists() now changed to return nothing since
its return value (nr_samples) is only for checking if there's any data
in the input file and this can be acheived by checking ->nr_entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
This patch figures out the max number of cpus and nodes that are on the
system and creates a map of cpu to node. This allows us to provide a cpu
and quickly get the node associated with it.
It was mostly copied from builtin-kmem.c and tweaked slightly to use less memory
(use possible cpus instead of max). It also calculates the max number of nodes.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1396896924-129847-2-git-send-email-dzickus@redhat.com
[ Removing out label code in init_cpunode_map ]
[ Adding check for snprintf error ]
[ Removing unneeded returns ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
In the current version, when using perf record, if something goes
wrong in tools/perf/builtin-record.c:375
session = perf_session__new(file, false, NULL);
The error message:
"Not enough memory for reading per file header"
is issued. This error message seems to be outdated and is not very
helpful. This patch proposes to replace this error message by
"Perf session creation failed"
I believe this issue has been brought to lkml:
https://lkml.org/lkml/2014/2/24/458
although this patch only tackles a (small) part of the issue.
Additionnaly, this patch improves error reporting in
tools/perf/util/data.c open_file_write.
Currently, if the call to open fails, the user is unaware of it.
This patch logs the error, before returning the error code to
the caller.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
[ Reorganize the changelog into paragraphs ]
[ Added empty line after fd declaration in open_file_write ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
pert-report doesn't resolve function names in VDSO:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
0x7fff6b1fe861
__gettimeofday
ACE_OS::gettimeofday()
...
In this case symbol values should be adjusted the same way as for executables,
relocatable objects and prelinked libraries.
After fix:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
__vdso_gettimeofday
__gettimeofday
ACE_OS::gettimeofday()
Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Every event in the perf-kvm has a 'stats' structure, which contains
max/min/average/etc times of handling this event.
The problem is that the 'perf-kvm stat report' command always shows
that 'min time' is 0us for every event. Example:
# perf kvm stat report
Analyze events for all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
[..]
0xB2 MSCH 12 0.07% 0.00% 0us 8us 7.31us ( +- 2.11% )
0xB2 CHSC 12 0.07% 0.00% 0us 18us 9.39us ( +- 9.49% )
0xB2 STPX 8 0.05% 0.00% 0us 2us 1.88us ( +- 7.18% )
0xB2 STSI 7 0.04% 0.00% 0us 44us 16.49us ( +- 38.20% )
[..]
This happens because the 'stats' structure is not initialized and
stats->min equals to 0. Lets initialize the structure for every
event after its allocation using init_stats() function. This initializes
stats->min to -1 and makes 'Min time' statistics counting work:
# perf kvm stat report
Analyze events for all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
[..]
0xB2 MSCH 12 0.07% 0.00% 6us 8us 7.31us ( +- 2.11% )
0xB2 CHSC 12 0.07% 0.00% 7us 18us 9.39us ( +- 9.49% )
0xB2 STPX 8 0.05% 0.00% 1us 2us 1.88us ( +- 7.18% )
0xB2 STSI 7 0.04% 0.00% 1us 44us 16.49us ( +- 38.20% )
[..]
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
[ Fixing the perf examples changelog output ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Add hist.percentage option for setting default value of the
symbol_conf.filter_relative. It affects the output of various perf
commands (like perf report, top and diff) only if filter(s) applied.
An user can write .perfconfig file like below to show absolute
percentage of filtered entries by default:
$ cat ~/.perfconfig
[hist]
percentage = absolute
And it can be changed through command line:
$ perf report --percentage relative
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>