In addition to using refcounts for the struct thread lifetime
management, we need to protect access to machine->threads from
concurrent access.
That happens in 'perf top', where a thread processes events, inserting
and deleting entries from that rb_tree while another thread decays
hist_entries, that end up dropping references and ultimately deleting
threads from the rb_tree and releasing its resources when no further
hist_entry (or other data structures, like in 'perf sched') references
it.
So the rule is the same for refcounts + protected trees in the kernel,
get the tree lock, find object, bump the refcount, drop the tree lock,
return, use object, drop the refcount if no more use of it is needed,
keep it if storing it in some other data structure, drop when releasing
that data structure.
I.e. pair "t = machine__find(new)_thread()" with a "thread__put(t)", and
"perf_event__preprocess_sample(&al)" with "addr_location__put(&al)".
The addr_location__put() one is because as we return references to
several data structures, we may end up adding more reference counting
for the other data structures and then we'll drop it at
addr_location__put() time.
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-bs9rt4n0jw3hi9f3zxyy3xln@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for the PERF_RECORD_AUX event type.
PERF_RECORD_AUX is a new kernel event that records when new data lands
in the AUX buffer. Currently it is assumed that AUX data follows the
same ring buffer conventions used by the perf events buffer, and
consequently the AUX event is not processed during recording.
It is processed during session processing so that the information in the
'flags' member is made available.
The format of PERF_RECORD_AUX is outlined in the linux/perf_events.h
header file. The 'flags' are also enumerated.
Intel PT and Intel BTS use the flag named PERF_AUX_FLAG_TRUNCATED to
determine if data has been lost because the buffer became full as perf
was not able to empty it fast enough.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instruction tracing will typically have access to information about the
instruction being executed for a particular ip sample. Some of that
information will be available in the 'flags' member of struct
perf_sample.
With the addition of transactions events synthesis to Instruction
Tracing options, there is a need to be able easily to see the flags
because they show whether the ip is at the start, commit or abort of a
tranasaction.
Consequently add an option to display the flags.
The flags are "bcrosyiABEx" which stand for branch, call, return,
conditional, system, asynchronous, interrupt, transaction abort, trace
begin, trace end, and in transaction, respectively.
Example using Intel PT:
perf script -fip,time,event,sym,addr,flags
...
1288.721584105: branches:u: bo 401146 main => 401152 main
1288.721584105: transactions: x 0 401164 main
1288.721584105: branches:u: bx 40117c main => 40119b main
1288.721584105: branches:u: box 4011a4 main => 40117e main
1288.721584105: branches:u: bcx 401187 main => 401094 g
...
1288.721591645: branches:u: bx 4010c4 g => 4010cb g
1288.721591645: branches:u: brx 4010cc g => 401189 main
1288.721591645: transactions: 0 4011a6 main
1288.721593199: branches:u: b 4011a9 main => 4011af main
1288.721593199: branches:u: bo 4011bc main => 40113e main
1288.721593199: branches:u: b 401150 main => 40115a main
1288.721593199: transactions: x 0 401164 main
1288.721593199: branches:u: bx 40117c main => 40119b main
1288.721593199: branches:u: box 4011a4 main => 40117e main
1288.721593199: branches:u: bcx 401187 main => 40105e f
...
1288.722284747: branches:u: brx 401093 f => 401189 main
1288.722284747: branches:u: box 4011a4 main => 40117e main
1288.722284747: branches:u: bcx 401187 main => 40105e f
1288.722285883: transactions: bA 0 401071 f
1288.722285883: branches:u: bA 401071 f => 40116a main
1288.722285883: branches:u: bE 40116a main => 0 [unknown]
1288.722297174: branches:u: bB 0 [unknown] => 40116a main
...
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add two user events for AUX area tracing.
PERF_RECORD_AUXTRACE_INFO contains metadata, consisting primarily the
type of the AUX area tracing data plus some amount of
architecture-specific information. There should be only one
PERF_RECORD_AUXTRACE_INFO event.
PERF_RECORD_AUXTRACE identifies AUX area tracing data copied from the
mmapped AUX area tracing region. The actual data is not part of the
event but immediately follows it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-4-git-send-email-adrian.hunter@intel.com
[ s/MIN/min/g and use cast to fix up wrt -Werror=sign-compare till we adopt min_t() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As these can be obtained from the ordered_events pointer, via
container_of, reducing the cross section of ordered_samples.
These were added to ordered_samples in:
commit b7b61cbebd
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Tue Mar 3 11:58:45 2015 -0300
perf ordered_events: Shorten function signatures
By keeping pointers to machines, evlist and tool in ordered_events.
But that was more a transitional patch while moving stuff out from
perf_session.c to ordered_events.c and possibly not even needed by then,
as we could use the container_of() method and instead of having the
nr_unordered_samples stats in events_stats, we can have it in
ordered_samples.
Based-on-a-patch-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4lk0t9js82g0tfc0x1onpkjt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull leftover perf fixes from Ingo Molnar:
"Two perf fixes left over from the previous cycle"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf session: Do not fail on processing out of order event
x86/asm/traps: Disable tracing and kprobes in fixup_bad_iret and sync_regs
Add an index of the event identifiers, in preparation for Intel PT.
The event id (also called the sample id) is a unique number
allocated by the kernel to the event created by perf_event_open(). Events
can include the event id by having a sample type including PERF_SAMPLE_ID or
PERF_SAMPLE_IDENTIFIER.
Currently the main use of the event id is to match an event back to the
evsel to which it belongs i.e. perf_evlist__id2evsel()
The purpose of this patch is to make it possible to match an event back to
the mmap from which it was read. The reason that is useful is because the
mmap represents a time-ordered context (either for a cpu or for a thread).
Intel PT decodes trace information on that basis. In full-trace mode, that
information can be recorded when the Intel PT trace is read, but in
sample-mode the Intel PT trace data is embedded in a sample and it is in
that case that the "id index" is needed.
So the mmaps are numbered (idx) and the cpu and tid recorded against the id
by perf_evlist__set_sid_idx() which is called by perf_evlist__mmap_per_evsel().
That information is recorded on the perf.data file in the new "id index".
idx, cpu and tid are added to struct perf_sample_id (which is the node of
evlist's hash table to match ids to evsels). The information can be
retrieved using perf_evlist__id2sid(). Note however this all depends on
having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER,
otherwise ids are not recorded.
The "id index" is a synthesized event record which will be created when
Intel PT sampling is used by calling perf_event__synthesize_id_index().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1414417770-18602-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_evlist__mmap_read used 'union perf_event' as a placeholder for
event crossing the mmap boundary.
This is ok for sample shorter than ~PATH_MAX. However we could grow up
to the maximum sample size which is 16 bits max.
I hit this overflow issue when using 'perf top -G dwarf' which produces
sample with the size around 8192 bytes. We could configure any valid
sample size here using: '-G dwarf,size'.
Using array with sample max size instead for the event placeholder. Also
adding another safe check for the dynamic size of the user stack.
TODO: The 'struct perf_mmap' is quite big now, maybe we could use some
lazy allocation for event_copy size.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1380721599-24285-1-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.
It adds for each file-backed mapping the device major, minor number and
the inode number and generation.
This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.
The patch will prefer MMAP2 over MMAP.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377079825-19057-3-git-send-email-eranian@google.com
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
fix 'perf test' regression test entry affected,
use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the sorting and histogram support
functions to enable profiling of memory accesses.
The following sorting orders are added:
- symbol_daddr: data address symbol (or raw address)
- dso_daddr: data address shared object
- locked: access uses locked transaction
- tlb : TLB access
- mem : memory level of the access (L1, L2, L3, RAM, ...)
- snoop: access snoop mode
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-12-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed, the move of methods to
machine.[ch], and the rename of dsrc to data_src, to match the change
made in the PERF_SAMPLE_DSRC in a previous patch. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf record has a new option -W that enables weightened sampling.
Add sorting support in top/report for the average weight per sample and the
total weight sum. This allows to both compare relative cost per event
and the total cost over the measurement period.
Add the necessary glue to perf report, record and the library.
v2: Merge with new hist refactoring.
v3: Fix manpage. Remove value check.
Rename global_weight to weight and weight to local_weight.
v4: Readd sort keys to manpage
v5: Move weight to end
v6: Move weight to template
v7: Rename weight key.
Original patch from Andi modified by Stephane Eranian <eranian@google.com>
to include ONLY the weight supporting code and apply to pristine 3.8.0-rc4.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1359040242-8269-6-git-send-email-eranian@google.com
[ committer note: changed to cope with fc5871ed and the hists_link perf test entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we were processing a PERF_RECORD_EXIT event we first used
machine__findnew_thread for both the thread exiting and for its parent,
only to use just the thread struct associated with the one exiting, and
to just delete it.
If it existed, i.e. not created at this very moment in
machine__findnew_thread, it will be moved to the machine->dead_threads
linked list, because we may have hist_entries pointing to it, but if it
was created just do be deleted, it will just sit there with no
references at all.
Use the new machine__find_thread() method so that if it is not there, we
don't create it.
As a bonus the parent thread will also not be created at this point.
Create process_fork() and process_exit() helpers to use this and make
the builtins use it instead of the generic process_task(), ditched by
this patch.
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-z7n2y98ebjyrvmytaope4vdl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some systems (e.g. Android), ALIGN is defined in system headers as
ALIGN(p). The definition of ALIGN used in perf takes 2 parameters:
ALIGN(x,a). This leads to redefinition conflicts.
Redefinition error on Android:
In file included from util/include/linux/list.h:1:0,
from util/callchain.h:5,
from util/hist.h:6,
from util/session.h:4,
from util/build-id.h:4,
from util/annotate.c:11:
util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror]
bionic/libc/include/sys/param.h:38:0: note: this is the location of
the previous definition
Conflics with system defined ALIGN in Android:
util/event.c: In function 'perf_event__synthesize_comm':
util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1
util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function)
util/event.c:115:9: note: each undeclared identifier is reported only once for
each function it appears in
In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN.
Signed-off-by: Irina Tirdea <irina.tirdea@intel.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Irina Tirdea <irina.tirdea@intel.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1347315303-29906-5-git-send-email-irina.tirdea@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>