Current --branch-history LBR annotation displays confused data. For
example, each cycles report is duplicated on both "from" and "to"
entries.
For example:
perf report --branch-history --no-children --stdio
--2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7% cycles:1)
main div.c:44 (predicted:49.7% cycles:1)
main div.c:42 (RET CROSS_2M cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (RET CROSS_2M cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M cycles:9)
The cycles should be tagged only on the "from". It's for the code block
that ends with "from", not for "to".
Another issue is the "predicted:49.7%" is duplicated too (tag on both
"from" and "to").
This patch tags the branch type/flag on "to" and tag the cycles on
"from".
For example:
--2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)
main div.c:44 (cycles:1)
main div.c:42 (RET CROSS_2M)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (RET CROSS_2M)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M)
|
--2.23%--__random_r random_r.c:392 (cycles:9)
In this example, The "main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)"
is "to" of branch and "main div.c:44 (cycles:1)" is "from" of branch.
It should be easier for understanding than before.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500894547-18411-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record -b -g <command>
perf report --branch-history
This merges the LBRs with the callgraphs.
However it would be nice if it also works without callgraphs (-g) set in
perf record, so that only the LBRs are displayed. But currently perf
report errors in this case. For example,
perf record -b <command>
perf report --branch-history
Error:
Selected -g or --branch-history but no callchain data. Did
you call 'perf record' without -g?
This patch displays the LBRs only even if callgraphs(-g) is not enabled
in perf record.
Change log:
v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:
┌─Error:─────────────────────────────────────┐
│Selected -g or --branch-history. │
│But no callchain or branch data. │
│Did you call 'perf record' without -g or -b?│
│ │
│ │
│Press any key... │
└────────────────────────────────────────────┘
When passing the last parameter to hists__fprintf,
changes "|" to "||".
hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
symbol_conf.use_callchain || symbol_conf.show_branchflag_count);
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1494240182-28899-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When converting from atomic_t to refcount_t we didn't follow the usual
step of initializing it to one before taking any new reference, which
trips over checking if taking a reference for a freed refcount_t, fix
it.
Brendan's report:
---
It's 4.12-rc7, with node v4.4.1. I'm building 4.13-rc1 now, as I hit
what I think is another unrelated perf bug and I'm starting to wonder
what else is broken on that version:
(root) /mnt/src/linux-4.12-rc7/tools/perf # ./perf record -F 99 -a -e
cpu-clock --cgroup=docker/f9e9d5df065b14646e8a11edc837a13877fd90c171137b2ba3feb67a0201cb65
-g
perf: /mnt/src/linux-4.12-rc7/tools/include/linux/refcount.h:108:
refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
Aborted
that used to work...
---
Testing it:
Before:
# perf stat -e cycles -C 0 --cgroup /
perf: /home/acme/git/linux/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
Aborted (core dumped)
#
After:
# perf stat -e cycles -C 0 --cgroup /
^C
Performance counter stats for 'CPU(s) 0':
132,081,393 cycles /
2.492942763 seconds time elapsed
#
Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Acked-by: Elena Reshetova <elena.reshetova@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sudeep Holla <Sudeep.Holla@arm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 79c5fe6db8 ("perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t")
Link: http://lkml.kernel.org/n/tip-l7ovfblq14ip2i08m1g0fkhv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To make it more clear that it is the sum of all the nr_samples fields in the
addr[] entries, i.e.:
sym_hist->nr_samples = sum(sym_hist->addr[0 .. symbol__size(sym)]->nr_samples)
Committer notes:
Taeung had renamed it to total_samples, but using nr_samples, as in the
added explanation above, looks clearer and establishes the direct
connection, making clear it is about the _number_ of samples.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1500500211-16599-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
struct sym_hist has addr[] but it should have not only number of samples
but also the sample period. So use new struct symhist_entry to pave the
way to have that.
Committer notes:
This initial patch will only introduce the struct sym_hist_entry and use
only the nr_samples member, which makes the code clearer and paves the
way to save the period as well.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1500500205-16553-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Show branch type in callchain entry. The branch type is printed
with other LBR information (such as cycles/abort/...).
For example:
perf record -g -j any,save_type
perf report --branch-history --stdio --no-children
38.50% div.c:45 [.] main div
|
---main div.c:42 (RET CROSS_2M cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (RET CROSS_2M cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M cycles:9)
Change log
v6: Remove the branch_type_str() since it's moved to branch.c.
v5: Rewrite the branch info print code in util/callchain.c.
v4: Comparing to previous version, the major changes are:
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-8-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Show the branch type statistics at the end of perf report --stdio.
For example:
perf report --stdio
COND_FWD: 28.5%
COND_BWD: 9.4%
CROSS_4K: 0.7%
CROSS_2M: 14.1%
COND: 37.9%
UNCOND: 0.2%
IND: 6.7%
CALL: 26.5%
RET: 28.7%
SYSRET: 0.0%
The branch types are:
COND_FWD: conditional forward
COND_BWD: conditional backward
COND: conditional branch
UNCOND: unconditional branch
IND: indirect
CALL: function call
IND_CALL: indirect function call
RET: function return
SYSCALL: syscall
SYSRET: syscall return
COND_CALL: conditional function call
COND_RET: conditional function return
CROSS_4K and CROSS_2M:
They are the metrics checking for branches cross 4K or 2MB pages.
It's an approximate computing. We don't know if the area is 4K or
2MB, so always compute both.
To make the output simple, if a branch crosses 2M area, CROSS_4K
will not be incremented.
Change log
v7: Since the common branch type definitions are changed, some
tags/strings are updated accordingly.
v6: Remove branch_type_stat_display() since it's moved to branch.c.
v5: Remove the unnecessary sort__mode checking in
hist_iter__branch_callback().
v4: Comparing to previous version, the major changes are:
Add the computing of JCC forward/JCC backward and cross page checking
by using the from and to addresses.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-7-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create new util/branch.c and util/branch.h to contain the common branch
functions. Such as:
branch_type_count(): Count the numbers of branch types
branch_type_name() : Return the name of branch type
branch_type_stat_display(): Display branch type statistics info
branch_type_str(): Construct the branch type string.
The branch type is saved in branch_flags.
Change log:
v8: Change PERF_BR_NONE to PERF_BR_UNKNOWN.
v7: Since the common branch type name is changed (e.g. JCC->COND),
this patch is performed the modification accordingly.
v6: Move that multiline conditional code inside {} brackets.
Move branch_type_stat_display() from builtin-report.c to
branch.c.
Move branch_type_str() from callchain.c to branch.c.
v5: It's a new patch in v5 patch series.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-6-git-send-email-yao.jin@linux.intel.com
[ Don't use 'index' and 'stat' as names for variables, it shadows global decls in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The branch info such as predicted/cycles/... are printed at the
callchain entries.
For example: perf report --branch-history --no-children --stdio
--1.07%--main div.c:39 (predicted:52.4% cycles:1 iterations:17)
main div.c:44 (predicted:52.4% cycles:1)
main div.c:42 (cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (cycles:1)
But the current code is difficult to maintain and extend. This patch
refactors the code for easy maintenance.
Change log:
v6: 1. Put the multiline condition code into {} brackets in
counts_str_build()
2. Keep the original display order, that is:
predicted, abort, cycles, iterations
v5: It's a new patch in v5 patch series.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500379995-6449-5-git-send-email-yao.jin@linux.intel.com
[ Don't use 'index' as a name for a variable, it shadows a globa decl in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add header record types to pipe-mode, reusing the functions
used in file-mode and leveraging the new struct feat_fd.
For alignment, check that synthesized events don't exceed
pagesize.
Add the perf_event__synthesize_feature event call back to
process the new header records.
Before this patch:
$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
...
After this patch:
$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
# ========
# captured on: Mon May 22 16:33:43 2017
# ========
#
# hostname : my_hostname
# os release : 4.11.0-dbx-up_perf
# perf version : 4.11.rc6.g6277c80
# arch : x86_64
# nrcpus online : 72
# nrcpus avail : 72
# cpudesc : Intel(R) Xeon(R) CPU E5-2696 v3 @ 2.30GHz
# cpuid : GenuineIntel,6,63,2
# total memory : 263457192 kB
# cmdline : /root/perf record -o - -e cycles -c 100000 sleep 1
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: intel_bts = 6, uncore_imc_4 = 22, uncore_sbox_1 = 47, uncore_cbox_5 = 33, uncore_ha_0 = 16, uncore_cbox
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
...
Support added for the subcommands: report, inject, annotate and script.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170718042549.145161-16-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are three FEAT_OP* macros:
- FEAT_OPA: for features without process record.
- FEAT_OPP: for features with process record.
- FEAT_OPF: like FEAT_OPP but to show only if show_full_info flags
is set.
To add pipe-mode headers we need yet another variation of the macros
(one to specify whether a feature generates an auxiliar record).
Instead, we redefine macros so that:
- show_full_info is specified as an argument (to remove the
FEAT_OPF variation) and,
- it always sets "process" handler (to remove the FEAT_OPA variation).
Individual process handlers can be NULLed individually.
This allows to define two variations only:
- FEAT_OPR: synthesizes auxiliar event record.
- FEAT_OPN: doesn't synthesize an auxiliar event record.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170718042549.145161-14-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a process is in a mountns and has symbols in /tmp/perf-<pid>.map,
look first in the namespace using the tgid for the pidns that the
process might be in. If the map isn't found there, try looking in the
mountns where perf is running, and use the tgid that's appropriate for
perf's pid namespace. If all else fails, use the original pid.
This allows us to locate a symbol map file in the mount namespace, if it
was generated there. However, we also try the tool's /tmp in case it's
there instead.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1499305693-1599-3-git-send-email-kjlx@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For marking fused instructions clearly this patch adds a line before the
first instruction of pair and joins it with the arrow of the jump to its
target.
For example, when "je" is selected in annotate view, the line before
cmpl is displayed and joins the arrow of "je".
│ ┌──cmpl $0x0,argp_program_version_hook
81.93 │ ├──je 20
│ │ lock cmpxchg %esi,0x38a9a4(%rip)
│ │↓ jne 29
│ │↓ jmp 43
11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
That means the cmpl+je is a fused instruction pair and they should be
considered together.
Changelog:
v3: Use Arnaldo's fix to improve the arrow origin rendering. To get the
evsel->evlist->env->cpuid, save the evsel in annotate_browser.
v2: new function "ins__is_fused" to check if the instructions are fused.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1499403995-19857-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Macro fusion merges two instructions to a single micro-op. Intel core
platform performs this hardware optimization under limited
circumstances.
For example, CMP + JCC can be "fused" and executed /retired together.
While with sampling this can result in the sample sometimes being on the
JCC and sometimes on the CMP. So for the fused instruction pair, they
could be considered together.
On Nehalem, fused instruction pairs:
cmp/test + jcc.
On other new CPU:
cmp/test/add/sub/and/inc/dec + jcc.
This patch adds an x86-specific function which checks if 2 instructions
are in a "fused" pair. For non-x86 arch, the function is just NULL.
Changelog:
v4: Move the CPU model checking to symbol__disassemble and save the CPU
family/model in arch structure.
It avoids checking every time when jump arrow printed.
v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs
just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC).
v2: Remove the original weak function. Arnaldo points out that doing it
as a weak function that will be overridden by the host arch doesn't
work. So now it's implemented as an arch-specific function.
Committer fix:
Do not access evsel->evlist->env->cpuid, ->env can be null, introduce
perf_evsel__env_cpuid(), just like perf_evsel__env_arch(), also used in
this function call.
The original patch was segfaulting 'perf top' + annotation.
But this essentially disables this fused instructions augmentation in
'perf top', the right thing is to get the cpuid from the running kernel,
left for a later patch tho.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1499403995-19857-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Which is the case in S/390, where symbols were not being resolved
because machine__get_kernel_start was only setting machine->kernel_start
when the just successfully loaded kernel symtab had its map->start set
to !0, when it was left at (1ULL << 63) assuming a partitioning of the
address space for user/kernel, which is not the case in S/390 nor in
Sparc.
So just check if map__load() was successfull and set
machine->kernel_start to zero, fixing kernel symbol resolution on S/390.
Test performed by Thomas:
----
I like this patch. I have done a new build and removed all my debug output to start
from scratch. Without your patch I get this:
# Samples: 4 of event 'cpu-clock'
# Event count (approx.): 1000000
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ ........................
75.00% 0.00% true [unknown] [k] 0x00000000004bedda
|
---0x4bedda
|
|--50.00%--0x42693a
| |
| --25.00%--0x2a72e0
| 0x2af0ca
| 0x3d1003fe4c0
|
--25.00%--0x4272bc
0x26fa84
and with your patch (I just rebuilt the perf tool, nothing else and used the same
perf.data file as input):
# Samples: 4 of event 'cpu-clock'
# Event count (approx.): 1000000
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .......................... ..................................
75.00% 0.00% true [kernel.vmlinux] [k] pgm_check_handler
|
---pgm_check_handler
do_dat_exception
handle_mm_fault
__handle_mm_fault
filemap_map_pages
|
|--25.00%--rcu_read_lock_held
| rcu_lockdep_current_cpu_online
| 0x3d1003ff4c0
|
--25.00%--lock_release
Looks good to me....
----
Reported-and-Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Link: http://lkml.kernel.org/n/tip-dk0n1uzmbe0tbthrpfqlx6bz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When no event is specified perf will use the "cycles" hardware event
with the highest precision available in the processor, and excluding
kernel events for non-root users, so make that clear in the event name
by setting the "u" event modifier, i.e. "cycles:upp".
E.g.:
The default for root:
# perf record usleep 1
# perf evlist -v
cycles:ppp: ..., precise_ip: 3, exclude_kernel: 0, ...
#
And for !root:
$ perf record usleep 1
$ perf evlist -v
cycles:uppp: ... , precise_ip: 3, exclude_kernel: 1, ...
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lf29zcdl422i9knrgde0uwy3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>