Use struct mmap_cpu_mask type for the tool's thread and mmap data
buffers to overcome current 1024 CPUs mask size limitation of cpu_set_t
type.
Currently glibc's cpu_set_t type has an internal mask size limit of 1024
CPUs.
Moving to the 'struct mmap_cpu_mask' type allows overcoming that limit.
The tools bitmap API is used to manipulate objects of 'struct mmap_cpu_mask'
type.
Committer notes:
To print the 'nbits' struct member we must use %zd, since it is a
size_t, this fixes the build in some toolchains/arches.
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/96d7e2ff-ce8b-c1e0-d52c-aa59ea96f0ea@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Restructure the event opening in perf stat to cycle through the events
by CPU after setting affinity to that CPU.
This eliminates IPI overhead in the perf API.
We have to loop through the CPU in the outter builtin-stat code instead
of leaving that to low level functions.
It has to change the weak group fallback strategy slightly. Since we
cannot easily undo the opens for other CPUs move the weak group retry to
a separate loop.
Before with a large test case with 94 CPUs:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
42.75 4.050910 67 60046 110 perf_event_open
After:
26.86 0.944396 16 58069 110 perf_event_open
(the number changes slightly because the weak group retries
work differently and the test case relies on weak groups)
Committer notes:
Added one of the hunks in a patch provided by Andi after I noticed that
the "event times" 'perf test' entry was segfaulting.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191121001522.180827-10-andi@firstfloor.org
Link: http://lore.kernel.org/lkml/20191127232657.GL84886@tassilo.jf.intel.com # Fix
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At the end of a 'perf record' session, by default, we'll process all
samples and populate the threads, maps, etc so as to find out which of
the DSOs got samples, to reduce the size of the build-id table we'll
add to the perf.data headers.
But we don't need to process the PERF_RECORD_MMAP events synthesized
for the kernel modules, as we have those already via
perf_session__create_kernel_maps(), so add mmap/mmap2 handlers that
first look at event->header.misc to see if the event is for a user map,
bailing out if not.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-mofoxvcx2dryppcw3o689jdd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The patch adds a new option to limit the output file size, then based on
it, we can create a wrapper of the perf command that uses the option to
avoid exhausting the disk space by the unconscious user.
In order to make the perf.data parsable, we just limit the sample data
size, since the perf.data consists of many headers and sample data and
other data, the actual size of the recorded file will bigger than the
setting value.
Testing it:
# ./perf record -a -g --max-size=10M
Couldn't synthesize bpf events.
[ perf record: perf size limit reached (10249 KB), stopping session ]
[ perf record: Woken up 32 times to write data ]
[ perf record: Captured and wrote 10.133 MB perf.data (71964 samples) ]
# ls -lh perf.data
-rw------- 1 root root 11M Oct 22 14:32 perf.data
# ./perf record -a -g --max-size=10K
[ perf record: perf size limit reached (10 KB), stopping session ]
Couldn't synthesize bpf events.
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 1.546 MB perf.data (69 samples) ]
# ls -l perf.data
-rw------- 1 root root 1626952 Oct 22 14:36 perf.data
Committer notes:
Fixed the build in multiple distros by using PRIu64 to print u64 struct
members, fixing this:
builtin-record.c: In function 'record__write':
builtin-record.c:150:5: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=]
rec->bytes_written >> 10);
^
CC /tmp/build/pe
Signed-off-by: Jiwei Sun <jiwei.sun@windriver.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Danter <richard.danter@windriver.com>
Link: http://lore.kernel.org/lkml/20191022080901.3841-1-jiwei.sun@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support directory output that contains a regular perf.data file, named
"data". By default the directory is named perf.data i.e.
perf.data
└── data
Most of the infrastructure to support a directory is already there. This
patch makes the changes needed to support the format above.
Presently there is no 'perf record' option to output a directory.
This is preparation for adding support for putting a copy of /proc/kcore in
the directory.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191004083121.12182-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__put() from tools/perf to libperf.
Once perf_mmap__put() is moved, we need a way to call application
related unmap code (AIO and aux related code for eprf), when the map
goes away.
Add the perf_mmap::unmap callback to do that.
The unmap path from perf is:
perf_mmap__put (libperf)
perf_mmap__munmap (libperf)
map->unmap_cb -> perf_mmap__unmap_cb (perf)
mmap__munmap (perf)
Committer notes:
Add missing linux/kernel.h to tools/perf/lib/mmap.c to get the BUG_ON
definition.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this isn't used at all in mmap.h but in evlist.h, so to cut down the
header dependency tree, move it to where it is used.
Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.
Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.
Noticed while processing a Jiri's libperf batch touching mmap.h, where
almost everything gets rebuilt because evlist.h is so popular, so cut
down't this rebuild the world party.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
[acme@quaco ~]$ perf record -b -e cycles date
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Mon 23 Sep 2019 11:00:59 AM -03
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (14 samples) ]
[acme@quaco ~]$
But we did a fallback and exclude_kernel was set, so no need for
resolving kernel symbols:
$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
$
After:
[acme@quaco ~]$ perf record -b -e cycles date
Mon 23 Sep 2019 11:07:18 AM -03
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (16 samples) ]
[acme@quaco ~]$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
[acme@quaco ~]$
No needless warning is emitted.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lkml.kernel.org/n/tip-5yqnr8xcqwhr15xktj2097ac@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is sometimes useful to generate a snapshot when perf record exits;
I've been using a wrapper script around the workload that would do a
killall -USR2 perf when the workload exits.
This patch makes it easier and also works when perf record is attached
to a pre-existing task. A new snapshot option 'e' can be specified in
-S to enable this behavior:
root@elsewhere:~# perf record -e intel_pt// -Se sleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.085 MB perf.data ]
Co-developed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com
[ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'.
Committer notes:
Fixed up these:
tools/perf/arch/arm/util/auxtrace.c
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/arm-spe.c
tools/perf/arch/s390/util/auxtrace.c
tools/perf/util/cs-etm.c
Also
cc1: warnings being treated as errors
tests/sample-parsing.c: In function 'do_test':
tests/sample-parsing.c:162: error: missing initializer
tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
struct evsel evsel = {
.needs_swap = false,
- .core.attr = {
- .sample_type = sample_type,
- .read_format = read_format,
+ .core = {
+ . attr = {
+ .sample_type = sample_type,
+ .read_format = read_format,
+ },
[perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
gcc (GCC) 4.4.7
Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In places where the equivalent was already being done, i.e.:
free(a);
a = NULL;
And in placs where struct members are being freed so that if we have
some erroneous reference to its struct, then accesses to freed members
will result in segfaults, which we can detect faster than use after free
to areas that may still have something seemingly valid.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One can just record callchains in the kernel or user space with this new
options.
We can use it together with "--all-kernel" options.
This two options is used just like print_stack(sys) or print_ustack(usr)
for systemtap.
Shown below is the usage of this new option combined with "--all-kernel"
options:
1. Configure all used events to run in kernel space and just collect
kernel callchains.
$ perf record -a -g --all-kernel --kernel-callchains
2. Configure all used events to run in kernel space and just collect
user callchains.
$ perf record -a -g --all-kernel --user-callchains
Committer notes:
Improved documentation to state that asking for kernel callchains really
is asking for excluding user callchains, and vice versa.
Further mentioned that using both won't get both, but nothing, as both
will be excluded.
Signed-off-by: yuzhoujian <yuzhoujian@didichuxing.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1559222962-22891-1-git-send-email-ufo19890607@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>