[The kernel patch needed for this is in tip now (b16a5b52eb perf/x86:
Add option to disable ...) So this user tools patch to make use of it
should be merged now]
Automatically disable collecting branch flags and cycles with
--call-graph lbr. This allows avoiding a bunch of extra MSR
reads in the PMI on Skylake.
When the kernel doesn't support the new flags they are automatically
cleared in the fallback code.
v2: Switch to use branch_sample_type instead of sample_type.
Adjust description.
Fix the fallback logic.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1449879144-29074-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should always return from thread__new(), the constructor, with the
object with a reference count of one, so that:
struct thread *thread = thread__new();
thread__put(thread);
Will call thread__delete().
If any reference is made to that 'thread' variable, it better use
thread__get(thread) to hold a reference.
We were returning with thread->refcnt set to zero, fix it and some cases
where thread__delete() was being called, which were not a problem
because just one reference was being used, now that we set it to 1, use
thread__put() instead.
Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4b9mkuk66to4ecckpmpvqx6s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new name is irq_poll as iopoll is already taken. Better suggestions
welcome.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
There are so many test cases use stack allocated 'struct machine'.
Including:
test__hists_link
test__hists_filter
test__mmap_thread_lookup
test__thread_mg_share
test__hists_output
test__hists_cumulate
Also, in non-test code (for example, machine__new_host()) there are
code use 'malloc()' to alloc struct machine.
These are dangerous operations, cause some tests fail or hung in
machines__exit(). For example, in
machines__exit ->
machine__destroy_kernel_maps ->
map_groups__remove ->
maps__remove ->
pthread_rwlock_wrlock
a incorrectly initialized lock causes unintended behavior.
This patch memset(0) that structure in machine__init() to ensure all
fields in 'struct machine' are initialized to zero.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-17-git-send-email-wangnan0@huawei.com
[ Use memset, see 'man bzero' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'he' cannot be NULL since it's caller hist_iter__top_callback() is
called only if iter->he is not NULL (see hist_entry_iter__add). So
setting 'sym' before the condition to simplify the code.
Also make it clearer that the top->symbol_filter_entry check is only
meaningful on stdio mode (i.e. when use_browser is 0).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-4-git-send-email-namhyung@kernel.org
[ Complete the simplification replacing one more he->ms.sym with sym ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ui__has_annotation() inside perf_top__record_precise_ip() should be
removed since it returns true only for TUI (and when sort key has
symbol). However the 'perf top --stdio' also supports annotation for a
symbol which was specified by 's' key action.
Actually it already does the necessary checks before calling the
function. So it's ok to get rid of the check here.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix dso__load_sym to put dso because dsos__add already got it.
Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.
Committer note:
Changed the place where the reference count is dropped to:
Fix it by dropping it right after creating curr_map, since we know that
either that operation failed and we need to drop the dso refcount or
that it succeed and we have it referenced via curr_map->dso.
Then only drop the curr_map refcount after we call dsos__add() to make
sure we hold a reference to it via curr_map->dso.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021118.10245.49869.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.
Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.
----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix cmd_stat() to release cpu_map objects (aggr_map and
cpus_aggr_map) afterwards.
refcnt debugger shows that the cmd_stat initializes cpu_map
but not puts it.
----
# ./perf stat -v ls
....
REFCNT: BUG: Unreclaimed objects found.
==== [0] ====
Unreclaimed cpu_map@0x29339c0
Refcount +1 => 1 at
./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
./perf(cmd_stat+0x5fe) [0x43594e]
./perf() [0x47b785]
./perf(main+0x617) [0x422587]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
./perf() [0x4226fd]
REFCNT: Total 1 objects are not reclaimed.
"cpu_map" leaks 1 objects
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021127.10245.93697.stgit@localhost.localdomain
[ Remove NULL checks before calling the put operation, it checks it already ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Boris reported that 'perf top' is unusable on his default 'black on
white' terminal, which uses (eye friendly) light-grey as a background
color.
The reason is that the TUI cursor for the current selection line uses
HE_COLORSET_SELECTED, and that has a default background color of
'lightgrey' - which is a common terminal background choice and thus
the colors conflict.
Use yellow as the background color instead: that should be an uncommon
terminal background, yet it's still ergonomic on both black and
white/grey terminals.
[ It would be a better solution to straight out detect color
collisions and resolve them reasonably by converting them to RGB and
calculating color space distances, but I was unable to find
proper documentation for SLtt_get_color_object() to recover the
current color scheme so I gave up ... Yellow works well enough. ]
Reported-and-Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150305103213.GA23046@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic support to parse ARM assembly.
This:
* enables perf to correctly show the disassembly, rather than chopping
some constants off at the '#' (which is not a comment character on
ARM).
* allows perf to identify ARM instructions that branch to other parts
within the same function, thereby properly annotating them.
* allows perf to identify function calls, allowing called functions to
be followed in the annotated view.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/n/tip-owp1uj0nmcgfrlppfyeetuyf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we have 2 kinds of stat counters based on when the event is
enabled:
1) tracee command events, which are enable once the
tracee executes exec syscall (enable_on_exec bit)
2) all other events which get alive within the
perf_event_open syscall
And 2) case could raise a problem in case we want additional filter to
be attached for event. In this case we want the event to be enabled
after it's configured with filter.
Changing the behaviour of 2) events, so they all are created as disabled
(disabled bit). Adding extra enable call to make them alive once they
finish setup.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449133606-14429-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>