Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.
Committer notes:
Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.
Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:
util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:
# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses the time members from the perf_event_mmap_page to convert
between TSC and perf time.
Due to a lack of foresight when Intel PT was implemented, those time
members were recorded in the (implementation dependent) AUXTRACE_INFO
event, the structure of which is generally inaccessible outside of the
Intel PT decoder. However now the conversion between TSC and perf time
is needed when processing a jitdump file when Intel PT has been used for
tracing.
So add a user event to record the time members. 'perf record' will
synthesize the event if the information is available. And session
processing will put a copy of the event on the session so that tools
like 'perf inject' can easily access it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the cpu_map event to pass/store cpu maps as data in
a pipe/perf.data.
We store maps in 2 formats:
- list of cpus
- mask of cpus
The format that takes less space is selected transparently in the
following patch.
The interface is made generic, so we could add the cpumap event data
into another event in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-8-git-send-email-jolsa@kernel.org
[ cpu_map_data_cpus -> cpu_map_entries, cpu_map_data_mask -> cpu_map_mask ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch modifies the perf tool to handle the new RECORD type,
PERF_RECORD_LOST_SAMPLES.
The number of lost-sample events is stored in
.nr_events[PERF_RECORD_LOST_SAMPLES]. The exact number of samples
which the kernel dropped is stored in total_lost_samples.
When the percentage of dropped samples is greater than 5%, a warning
is printed.
Here are some examples:
Eg 1, Recording different frequently-occurring events is safe with the
patch. Only a very low drop rate is associated with such actions.
$ perf record -e '{cycles:p,instructions:p}' -c 20003 --no-time ~/tchain ~/tchain
$ perf report -D | tail
SAMPLE events: 120243
MMAP2 events: 5
LOST_SAMPLES events: 24
FINISHED_ROUND events: 15
cycles:p stats:
TOTAL events: 59348
SAMPLE events: 59348
instructions:p stats:
TOTAL events: 60895
SAMPLE events: 60895
$ perf report --stdio --group
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 24
#
# Samples: 120K of event 'anon group { cycles:p, instructions:p }'
# Event count (approx.): 24048600000
#
# Overhead Command Shared Object Symbol
# ................ ........... ................
..................................
#
99.74% 99.86% tchain_edit tchain_edit [.] f3
0.09% 0.02% tchain_edit tchain_edit [.] f2
0.04% 0.00% tchain_edit [kernel.vmlinux] [k] ixgbe_read_reg
Eg 2, Recording the same thing multiple times can lead to high drop
rate, but it is not a useful configuration.
$ perf record -e '{cycles:p,cycles:p}' -c 20003 --no-time ~/tchain
Warning: Processed 600592 samples and lost 99.73% samples!
[perf record: Woken up 148 times to write data]
[perf record: Captured and wrote 36.922 MB perf.data (1206322 samples)]
[perf record: Woken up 1 times to write data]
[perf record: Captured and wrote 0.121 MB perf.data (1629 samples)]
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1431285195-14269-9-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for the PERF_RECORD_AUX event type.
PERF_RECORD_AUX is a new kernel event that records when new data lands
in the AUX buffer. Currently it is assumed that AUX data follows the
same ring buffer conventions used by the perf events buffer, and
consequently the AUX event is not processed during recording.
It is processed during session processing so that the information in the
'flags' member is made available.
The format of PERF_RECORD_AUX is outlined in the linux/perf_events.h
header file. The 'flags' are also enumerated.
Intel PT and Intel BTS use the flag named PERF_AUX_FLAG_TRUNCATED to
determine if data has been lost because the buffer became full as perf
was not able to empty it fast enough.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1430404667-10593-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add two user events for AUX area tracing.
PERF_RECORD_AUXTRACE_INFO contains metadata, consisting primarily the
type of the AUX area tracing data plus some amount of
architecture-specific information. There should be only one
PERF_RECORD_AUXTRACE_INFO event.
PERF_RECORD_AUXTRACE identifies AUX area tracing data copied from the
mmapped AUX area tracing region. The actual data is not part of the
event but immediately follows it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1428594864-29309-4-git-send-email-adrian.hunter@intel.com
[ s/MIN/min/g and use cast to fix up wrt -Werror=sign-compare till we adopt min_t() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an index of the event identifiers, in preparation for Intel PT.
The event id (also called the sample id) is a unique number
allocated by the kernel to the event created by perf_event_open(). Events
can include the event id by having a sample type including PERF_SAMPLE_ID or
PERF_SAMPLE_IDENTIFIER.
Currently the main use of the event id is to match an event back to the
evsel to which it belongs i.e. perf_evlist__id2evsel()
The purpose of this patch is to make it possible to match an event back to
the mmap from which it was read. The reason that is useful is because the
mmap represents a time-ordered context (either for a cpu or for a thread).
Intel PT decodes trace information on that basis. In full-trace mode, that
information can be recorded when the Intel PT trace is read, but in
sample-mode the Intel PT trace data is embedded in a sample and it is in
that case that the "id index" is needed.
So the mmaps are numbered (idx) and the cpu and tid recorded against the id
by perf_evlist__set_sid_idx() which is called by perf_evlist__mmap_per_evsel().
That information is recorded on the perf.data file in the new "id index".
idx, cpu and tid are added to struct perf_sample_id (which is the node of
evlist's hash table to match ids to evsels). The information can be
retrieved using perf_evlist__id2sid(). Note however this all depends on
having a sample type including PERF_SAMPLE_ID or PERF_SAMPLE_IDENTIFIER,
otherwise ids are not recorded.
The "id index" is a synthesized event record which will be created when
Intel PT sampling is used by calling perf_event__synthesize_id_index().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1414417770-18602-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.
It adds for each file-backed mapping the device major, minor number and
the inode number and generation.
This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.
The patch will prefer MMAP2 over MMAP.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1377079825-19057-3-git-send-email-eranian@google.com
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
fix 'perf test' regression test entry affected,
use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>