Don't read broken data after 'head' pointer.
Following commits will feed perf_evlist__mmap_read() with some 'head'
pointers not maintained by kernel. If 'head' pointer breaks an event, we
should avoid reading from the broken event. This can happen in backward
ring buffer.
For example:
old head
| |
V V
+---+------+----------+----+-----+--+
|..E|D....D|C........C|B..B|A....|E.|
+---+------+----------+----+-----+--+
'old' pointer points to the beginning of 'A' and trying read from it,
but 'A' has been overwritten. In this case, don't try to read from 'A',
simply return NULL.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461637738-62722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
write_buildid() increments 'name_len' with intention to take into
account trailing zero byte. However, 'name_len' was already incremented
in machine__write_buildid_table() before. So this leads to
out-of-bounds read in do_write():
$ ./perf record sleep 0
[ perf record: Woken up 1 times to write data ]
=================================================================
==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
READ of size 19 at 0x00000099fc92 thread T0
#0 0x7f1aa9c7eab4 (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
#1 0x649c5b in do_write util/header.c:67
#2 0x649c5b in write_padded util/header.c:82
#3 0x57e8bc in write_buildid util/build-id.c:239
#4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
...
0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
'*.LC99' is ascii string '[kernel.kallsyms]'
...
Shadow bytes around the buggy address:
0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9
=>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9
0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
[ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Build fixes:
- Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo)
- Remove x86 references from arch-neutral Build, fixing it in !x86 arches,
reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Do memset() variable 'st' using the correct size in the jit code (Colin Ian King)
- Fix postgresql ubuntu 'perf script' install instructions (Chris Phlipot)
- Use callchain_param more thoroughly when checking how callchains were
configured, eventually will be the only way to look for callchain parameters
(Arnaldo Carvalho de Melo)
- Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current code is memsetting the 'struct stat' variable 'st' with the size of
'stat' (which turns out to be 1 byte) rather than the size of variable 'sz'.
Committer notes:
sizeof(function) isn't valid, the result depends on the compiler used, with
gcc, enabling pedantic warnings we get:
$ cat sizeof_function.c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
int main(void)
{
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
return 0;
}
$ readelf -sW sizeof_function | grep -w stat
49: 0000000000400630 16 FUNC WEAK HIDDEN 13 stat
$ cc -pedantic sizeof_function.c -o sizeof_function
sizeof_function.c: In function ‘main’:
sizeof_function.c:8:46: warning: invalid application of ‘sizeof’ to a function type [-Wpointer-arith]
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
^
$ ./sizeof_function
sizeof(stat)=1, stat=0x400630
$
Standard C, section 6.5.3.4:
"The sizeof operator shall not be applied to an expression that has function
type or an incomplete type, to the parenthesized name of such a type,
or to an expression that designates a bit-field member."
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/1461020838-9260-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tracing a workload that uses transactions gave a seg fault as follows:
perf record -e intel_pt// workload
perf report
Program received signal SIGSEGV, Segmentation fault.
0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
at util/intel-pt.c:929
929 ptq->last_branch_rb->nr = 0;
(gdb) p ptq->last_branch_rb
$1 = (struct branch_stack *) 0x0
(gdb) up
1148 intel_pt_reset_last_branch_rb(ptq);
(gdb) l
1143 if (ret)
1144 pr_err("Intel Processor Trace: failed to deliver transaction event
1145 ret);
1146
1147 if (pt->synth_opts.callchain)
1148 intel_pt_reset_last_branch_rb(ptq);
1149
1150 return ret;
1151 }
1152
(gdb) p pt->synth_opts.callchain
$2 = true
(gdb)
(gdb) bt
#0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
#1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
#2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)
Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag. Fix that.
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445ee72 ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
# perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 672
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
symbol_conf
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
#
To fix it just pass a parameter to perf_evsel__fprintf_sym telling if
callchains should be printed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-comrsr20bsnr8bg0n6rfwv12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The recent perf_evsel__fprintf_callchain() move to evsel.c added several
new symbol requirements to the python binding, for instance:
# perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 18030
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
callchain_cursor
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
#
This would require linking against callchain.c to access to the global
callchain_cursor variables.
Since lots of functions already receive as a parameter a
callchain_cursor struct pointer, make that be the case for some more
function so that we can start phasing out usage of yet another global
variable.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements from Arnaldo Carvalho de Melo:
User visible changes:
- Automagically create a 'bpf-output' event, easing the setup of BPF
C "scripts" that produce output via the perf ring buffer. Now it is
just a matter of calling any perf tool, such as 'trace', with a C
source file that references the __bpf_stdout__ output channel and
that channel will be created and connected to the script:
# trace -e nanosleep --event test_bpf_stdout.c usleep 1
0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ...
0.013 ( ): __bpf_stdout__:Raise a BPF event!..)
0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
0.261 ( ): __bpf_stdout__:Raise a BPF event!..)
0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0
#
Further work is needed to reduce the number of lines in a perf bpf C source
file, this being the part where we greatly reduce the command line setup (Wang Nan)
- 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
processing. This reduces the overhead by asking just for userspace callchains
and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
(Milian Wolff, Arnaldo Carvalho de Melo)
Try it with, for instance:
# perf trace --call dwarf ping 127.0.0.1
An excerpt of a system wide 'perf trace --call dwarf" session is at:
https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt
You may need to bump the number of mmap pages, using -m/--mmap-pages,
but on a Broadwell machine the defaults allowed system wide tracing to
work without losing that many records, experiment with just some
syscalls, like:
# perf trace --call dwarf -e nanosleep,futex
All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
etc) should work. Also --duration may be interesting to try.
To get filenames from in various syscalls pointer args (open, ettc), add this
to the mix:
# perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
Making this work is next in line:
# trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1
I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
in raw_syscalls:sys_exit.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Beautify more syscall arguments in 'perf trace', using the type column in
tracepoint /format fields to attach, for instance, a pid_t resolver to the
thread COMM, also attach a mode_t beautifier in the same fashion
(Arnaldo Carvalho de Melo)
- Build the syscall table id <-> name resolver using the same .tbl file
used in the kernel to generate headers, to avoid the delay in getting
new syscalls supported in the audit-libs external dependency, done so
far only for x86_64 (Arnaldo Carvalho de Melo)
- Improve the documentation of event specifications (Andi Kleen)
- Process update events in 'perf script', fixing up this use case:
# perf stat -a -I 1000 -e cycles record | perf script -s script.py
- Shared object symbol adjustment fixes, fixing symbol resolution in
Android (Wang Nan)
Infrastructure changes:
- Add dedicated unwind addr_space member into thread struct, to allow
tools to use thread->priv, noticed while working on having callchains
in 'perf trace' (Jiri Olsa)
Build fixes:
- Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The fprintf_sym() and fprintf_callchain() methods now allow users to
change the existing behaviour of showing "[unknown]" as the name of
unresolved symbols to instead show "[0x123456]", i.e. its address.
The current patch doesn't change tools to use this facility, the results
from 'perf trace' and 'perf script' cotinue like:
70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
[unknown] (/usr/lib64/libc-2.22.so)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
start_thread+0xca (/usr/lib64/libpthread-2.22.so)
__clone+0x6d (/usr/lib64/libc-2.22.so)
The next patch will make 'perf trace' use the new formatting.
Suggested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows cloning bpf-output event configuration among multiple
bpf scripts. If there exist a map named '__bpf_output__' and not
configured using 'map:__bpf_output__.event=', this patch clones the
configuration of another '__bpf_stdout__' map. For example, following
command:
# perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
--ev ./test_bpf_trace2.c usleep 100000
equals to:
# perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
--ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
usleep 100000
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>