build_id_cache__kallsyms_path() accepts a string buffer but also allocs
a buffer using asnprintf. Unfortunately, its only user passes it a
stack-allocated buffer. Freeing it causes crashes like this:
$ perf script
*** Error in `/home/wangnan/perf': free(): invalid pointer: 0x00007fffffff9630 ***
======= Backtrace: =========
lib64/libc.so.6(+0x6eeef)[0x7ffff5dbaeef]
lib64/libc.so.6(+0x78cae)[0x7ffff5dc4cae]
lib64/libc.so.6(+0x79987)[0x7ffff5dc5987]
/home/w00229757/perf(build_id_cache__kallsyms_path+0x6b)[0x49681b]
/home/w00229757/perf[0x4bdd40]
/home/w00229757/perf(dso__load+0xa3a)[0x4c048a]
/home/w00229757/perf(map__load+0x6f)[0x4d561f]
/home/w00229757/perf(thread__find_addr_map+0x235)[0x49e935]
/home/w00229757/perf(machine__resolve+0x7d)[0x49ec6d]
/home/w00229757/perf[0x4555a8]
/home/w00229757/perf[0x4d9507]
/home/w00229757/perf[0x4d9e80]
/home/w00229757/perf(ordered_events__flush+0x354)[0x4dd444]
/home/w00229757/perf(perf_session__process_events+0x3d0)[0x4dc140]
/home/w00229757/perf(cmd_script+0x12b0)[0x4592e0]
/home/w00229757/perf[0x4911f1]
/home/w00229757/perf(main+0x68f)[0x4352ef]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7ffff5d6dbd5]
/home/w00229757/perf[0x435415]
======= Memory map: ========
This patch simplifies build_id_cache__kallsyms_path() so that it never
allocates a string buffer and therefore never frees anything. Its
caller should manage memory allocation.
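
The caller-owned buffer convention described above can be sketched as
follows. This is only an illustration, not the perf code:
format_kallsyms_path() and the path layout it produces are hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the perf function: formats a cache path into a
 * buffer owned by the caller. It never allocates, so there is nothing for
 * it (or anyone else) to free by mistake.
 * The path layout used here is an assumption made for the example.
 */
static char *format_kallsyms_path(const char *cache_dir, const char *sbuild_id,
				  char *buf, size_t size)
{
	int n = snprintf(buf, size, "%s/[kernel.kallsyms]/%s/kallsyms",
			 cache_dir, sbuild_id);

	/* Report an encoding error or truncation instead of a partial path. */
	if (n < 0 || (size_t)n >= size)
		return NULL;
	return buf;
}
```

A caller can pass a stack array and sizeof() of that array; since the
helper never calls free(), a stack-allocated buffer is safe.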
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Fixes: 01412261d9 ("perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid")
Link: http://lkml.kernel.org/r/1465271678-7392-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To try to, over time, consistently use the IS_ERR() interface instead of
using two return values, i.e. the integer return value for an error and
the pointer address to return the bpf_map->priv pointer.
Also rename it to bpf__priv(), to leave the "get" term for reference
counting.
Noticed while working on using BPF for collecting non-integer syscall
argument payloads (struct sockaddr in calls such as connect(), for
instance), where we need to use BPF maps and thus generalise
bpf__setup_stdout() to connect bpf_output events with maps in a bpf
proggie.
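
For readers unfamiliar with the IS_ERR() convention mentioned above, a
minimal userspace sketch follows. The macros mirror the well-known kernel
helpers, but struct demo_map and demo_map_priv() are hypothetical names
used only for illustration and are not the libbpf API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value stored by ERR_PTR(). */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Pointers in the top MAX_ERRNO bytes of the address space are errors. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical map type, only for this example. */
struct demo_map {
	void *priv;
};

/*
 * One return value instead of two: either the priv pointer or an
 * ERR_PTR()-encoded error.
 */
static void *demo_map_priv(const struct demo_map *map)
{
	if (!map)
		return ERR_PTR(-EINVAL);
	return map->priv;
}
```

A caller checks IS_ERR() on the result and, on failure, extracts the
error code with PTR_ERR().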
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-saypxyd6ptrct379jqgxx4bl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a config file has wrong key-value pairs, the perf process is
forcibly terminated by die() in perf_parse_file(), called by
perf_config(), so terminal settings can be left broken because of the
abnormal termination.
For example:
If user config file has a wrong value 'red;default' instead of a normal
value like 'red, default' for a key 'colors.top',
# cat ~/.perfconfig
[colors]
medium = red;default # wrong value
and the 'top' sub-command is run,
# perf top
the perf process is forcibly terminated and the terminal settings are
left broken, with a message like the one below:
Fatal: bad config file line 2 in /root/.perfconfig
Fix it by letting perf_config() return on failure instead of calling
die() in perf_parse_file().
If a config file has wrong values, show the error message and then use
the default config values instead of the wrong ones.
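
The "report and fall back to defaults" behaviour can be sketched like
this. It is not the perf config code: validate_colors(),
colors_or_default() and DEFAULT_COLORS are hypothetical, and the
validation rule (a comma must be present) is a simplification.

```c
#include <stdio.h>
#include <string.h>

/* Placeholder default, taken from the 'red, default' example above. */
#define DEFAULT_COLORS "red, default"

/*
 * Hypothetical validator: a colour value must contain a comma separating
 * foreground and background. Returns 0 on success, -1 on a bad value.
 */
static int validate_colors(const char *value)
{
	return (value && strchr(value, ',')) ? 0 : -1;
}

/*
 * Instead of terminating the process on a bad value, report the error and
 * fall back to the default, so the caller keeps running and can restore
 * the terminal normally on exit.
 */
static const char *colors_or_default(const char *value)
{
	if (validate_colors(value) < 0) {
		fprintf(stderr, "Error: bad config value '%s', using default\n",
			value ? value : "(null)");
		return DEFAULT_COLORS;
	}
	return value;
}
```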
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1465210380-26749-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement the TopDown formulas in 'perf stat'. The topdown basic metrics
reported by the kernel are collected, and the formulas are computed and
output as normal metrics.
See the kernel commit exporting the events for details on the used
metrics.
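
As a rough sketch of what such formulas look like, the following computes
the four level 1 metrics from the five topdown-* counts. The formulas are
the commonly cited TopDown level 1 definitions and are an assumption here,
not a copy of the patch; the struct and function names are hypothetical.

```c
/* Raw counts of the five topdown-* events (hypothetical container). */
struct topdown_counts {
	double total_slots;
	double slots_issued;
	double slots_retired;
	double fetch_bubbles;
	double recovery_bubbles;
};

/* The four level 1 metrics, as fractions of total slots. */
struct topdown_l1 {
	double retiring;
	double bad_speculation;
	double frontend_bound;
	double backend_bound;
};

/* Treat tiny negative rounding errors as zero, to avoid printing -0.0. */
static double clamp_small_negative(double v)
{
	return (v < 0.0 && v >= -0.02) ? 0.0 : v;
}

/* Returns 0 on success, -1 when no slots were counted. */
static int topdown_level1(const struct topdown_counts *c, struct topdown_l1 *out)
{
	if (c->total_slots <= 0.0)
		return -1;

	out->retiring = c->slots_retired / c->total_slots;
	out->frontend_bound = c->fetch_bubbles / c->total_slots;
	out->bad_speculation = (c->slots_issued - c->slots_retired +
				c->recovery_bubbles) / c->total_slots;
	/* Whatever is not accounted for above is attributed to the backend. */
	out->backend_bound = clamp_small_negative(1.0 - (out->retiring +
							 out->frontend_bound +
							 out->bad_speculation));
	return 0;
}
```

With this definition the four fractions sum to 1, which matches the
example output where each row adds up to roughly 100%.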
Committer note:
Output example:
# perf stat --topdown -a usleep 1
Performance counter stats for 'system wide':
                  retiring   bad speculation   frontend bound   backend bound
S0-C0         2      23.8%             11.6%            28.3%           36.3%
S0-C1         2      16.2%             15.7%            36.5%           31.6%
0.000579956 seconds time elapsed
#
v2: Always print all metrics, only use thresholds for coloring.
v3: Mark retiring over threshold green, not red.
v4: Only print one decimal digit
Fix color printing of one metric
v5: Avoid printing -0.0
v6: Remove extra frontend event lookup
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add basic plumbing for TopDown in perf stat
TopDown is intended to replace the frontend cycles idle / backend cycles
idle metrics in standard perf stat output. These metrics are not
reliable in many workloads, due to out of order effects.
This implements a new --topdown mode in perf stat (similar to
--transaction) that measures the pipeline bottlenecks using
standardized formulas. The measurement can all be done with 5 counters
(one fixed counter).
The result is four metrics:
FrontendBound, BackendBound, BadSpeculation, Retiring
that describe the CPU pipeline behavior on a high level.
The full top down methodology has many hierarchical metrics. This
implementation only supports level 1 which can be collected without
multiplexing. A full implementation of top down on top of perf is
available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools)
The current version works on Intel Core CPUs starting with Sandy Bridge,
and Atom CPUs starting with Silvermont. In principle the generic
metrics should also be implementable on other out of order CPUs.
TopDown level 1 uses a set of abstracted metrics which are generic to
out of order CPU cores (although some CPUs may not implement all of
them):
topdown-total-slots       Available slots in the pipeline
topdown-slots-issued      Slots issued into the pipeline
topdown-slots-retired     Slots successfully retired
topdown-fetch-bubbles     Pipeline gaps in the frontend
topdown-recovery-bubbles  Pipeline gaps during recovery
                          from misspeculation
These metrics then allow computing four useful metrics:
FrontendBound, BackendBound, Retiring, BadSpeculation.
Add a new --topdown option to enable events. When --topdown is
specified set up events for all topdown events supported by the kernel.
Add topdown-* as a special case to the event parser, as is needed for
all events containing -.
The actual code to compute the metrics is in follow-on patches.
v2: Use standard sysctl read function.
v3: Move x86 specific code to arch/
v4: Enable --metric-only implicitly for topdown.
v5: Add --single-thread option to not force per core mode
v6: Fix output order of topdown metrics
v7: Allow combining with -d
v8: Remove --single-thread again
v9: Rename functions, adding arch_ and topdown_.
v10: Expand man page and describe TopDown better
Paste intro into commit description.
Print error when malloc fails.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464119559-17203-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need to receive events from an overwritable ring buffer.
Instead, perf should make them run in the background until some external
event of interest takes place. This patch makes perf ignore normal events
from overwrite evlists.
Overwritable events must be mapped readonly and backward, so if evlist
and evsel don't match (evsel->overwrite is true but either evlist is
read/write or evlist is not backward, and vice versa), skip mapping it.
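
The matching rule above can be sketched as a small predicate. The structs
below are hypothetical stand-ins, not the perf evlist/evsel types, and
only carry the flags the rule needs.

```c
#include <stdbool.h>

/* Hypothetical, reduced views of an evlist and an evsel. */
struct demo_evlist {
	bool readonly;	/* ring buffer is mapped read-only */
	bool backward;	/* ring buffer is written backward */
};

struct demo_evsel {
	bool overwrite;	/* event wants an overwritable ring buffer */
};

/*
 * An evlist is suitable for overwritable events only when it is both
 * readonly and backward. Skip the mapping whenever the evsel's wish and
 * the evlist's kind disagree, in either direction.
 */
static bool should_skip_mmap(const struct demo_evlist *evlist,
			     const struct demo_evsel *evsel)
{
	bool evlist_is_overwrite = evlist->readonly && evlist->backward;

	return evsel->overwrite != evlist_is_overwrite;
}
```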
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464056944-166978-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch, a simple 'perf record' could fail if kptr_restrict is
set to 1 (for normal user) or 2 (for root):
# perf record ls
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Segmentation fault (core dumped)
This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
available.
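
The idea of guarding the synthesis step can be sketched as below. This is
an illustration under simplifying assumptions, not the patch itself: the
function names are hypothetical, and using "is root" as the privilege
test is a simplification of the kernel's capability check.

```c
#include <stdbool.h>

/*
 * Simplified model of kptr_restrict as described above: 1 hides kernel
 * addresses from normal users, 2 hides them from everyone, root included.
 */
static bool kallsyms_restricted(int kptr_restrict, bool is_root)
{
	if (kptr_restrict >= 2)
		return true;
	if (kptr_restrict == 1)
		return !is_root;
	return false;
}

/*
 * Hypothetical stand-in for the synthesis step: when kernel addresses are
 * not available, skip quietly instead of dereferencing missing data.
 * *emitted counts how many kernel mmap events were produced.
 */
static int synthesize_kernel_mmap_demo(int kptr_restrict, bool is_root,
				       int *emitted)
{
	if (kallsyms_restricted(kptr_restrict, is_root))
		return 0;	/* nothing to synthesize, not an error */

	(*emitted)++;
	return 0;
}
```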
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e9005690 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Ingo Molnar:
"Mostly tooling and PMU driver fixes, but also a number of late updates
such as the reworking of the call-chain size limiting logic to make
call-graph recording more robust, plus tooling side changes for the
new 'backwards ring-buffer' extension to the perf ring-buffer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
perf record: Read from backward ring buffer
perf record: Rename variable to make code clear
perf record: Prevent reading invalid data in record__mmap_read
perf evlist: Add API to pause/resume
perf trace: Use the ptr->name beautifier as default for "filename" args
perf trace: Use the fd->name beautifier as default for "fd" args
perf report: Add srcline_from/to branch sort keys
perf evsel: Record fd into perf_mmap
perf evsel: Add overwrite attribute and check write_backward
perf tools: Set buildid dir under symfs when --symfs is provided
perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
perf annotate: Sort list of recognised instructions
perf annotate: Fix identification of ARM blt and bls instructions
perf tools: Fix usage of max_stack sysctl
perf callchain: Stop validating callchains by the max_stack sysctl
perf trace: Fix exit_group() formatting
perf top: Use machine->kptr_restrict_warned
perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
perf machine: Do not bail out if not managing to read ref reloc symbol
perf/x86/intel/p4: Trival indentation fix, remove space
...
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- We should not use the current value of the kernel.perf_event_max_stack as the
default value for --max-stack in tools that can process perf.data files: the
two only match if that sysctl wasn't changed from its default value at the
time the perf.data file was recorded. Fix it.
This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)
- Provide a better warning when running 'perf trace' on a system where the
kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
(Arnaldo Carvalho de Melo)
- Fix ordering of instructions in the annotation code, noticed when annotating
ARM binaries, now that table is auto-ordered at first use, to avoid more such
problems (Chris Ryder)
- Set buildid dir under symfs when --symfs is provided (He Kuang)
- Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)
- Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
traced (Arnaldo Carvalho de Melo)
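
The "table is auto-ordered at first use" approach from the annotate item
above can be sketched as follows. The table contents and all names are
hypothetical; the point is only that sorting once before the first
bsearch() removes the need to keep the source table ordered by hand.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical instruction table entry. */
struct demo_ins {
	const char *name;
	int kind;
};

/* Deliberately unsorted: new entries can be added in any order. */
static struct demo_ins demo_table[] = {
	{ "bne", 1 }, { "blt", 1 }, { "add", 0 }, { "bls", 1 }, { "call", 2 },
};
static bool demo_table_sorted;

/* qsort() comparator: order entries by name. */
static int demo_ins_cmp(const void *a, const void *b)
{
	const struct demo_ins *ia = a, *ib = b;

	return strcmp(ia->name, ib->name);
}

/* bsearch() comparator: compare a name key against an entry. */
static int demo_key_cmp(const void *key, const void *elem)
{
	const struct demo_ins *ins = elem;

	return strcmp(key, ins->name);
}

/* Sort the table on first use, then binary-search it. */
static const struct demo_ins *demo_find_ins(const char *name)
{
	const size_t n = sizeof(demo_table) / sizeof(demo_table[0]);

	if (!demo_table_sorted) {
		qsort(demo_table, n, sizeof(demo_table[0]), demo_ins_cmp);
		demo_table_sorted = true;
	}
	return bsearch(name, demo_table, n, sizeof(demo_table[0]), demo_key_cmp);
}
```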
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
- Live patching support for ppc64le (also merged via livepatching.git)
Various cleanups & minor fixes from:
- Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
Bauermann, Valentin Rothberg, Vipin K Parashar.
General:
- Update LMB associativity index during DLPAR add/remove from Nathan
Fontenot
- Fix branching to OOL handlers in relocatable kernel from Hari Bathini
- Add support for userspace Power9 copy/paste from Chris Smart
- Always use STRICT_MM_TYPECHECKS from Michael Ellerman
- Add mask of possible MMU features from Michael Ellerman
PCI:
- Enable pass through of NVLink to guests from Alexey Kardashevskiy
- Cleanups in preparation for powernv PCI hotplug from Gavin Shan
- Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
- Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
- Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
from Guilherme G Piccoli
- Remove the dependency on EEH struct in DDW mechanism from Guilherme
G Piccoli
selftests:
- Test cp_abort during context switch from Chris Smart
- Add several tests for transactional memory support from Rashmica
Gupta
perf:
- Add support for sampling interrupt register state from Anju T
- Add support for unwinding perf-stackdump from Chandan Kumar
cxl:
- Configure the PSL for two CAPI ports on POWER8NVL from Philippe
Bergheaud
- Allow initialization on timebase sync failures from Frederic Barrat
- Increase timeout for detection of AFU mmio hang from Frederic
Barrat
- Handle num_of_processes larger than can fit in the SPA from Ian
Munsie
- Ensure PSL interrupt is configured for contexts with no AFU IRQs
from Ian Munsie
- Add kernel API to allow a context to operate with relocate disabled
from Ian Munsie
- Check periodically the coherent platform function's state from
Christophe Lombard
Freescale:
- Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
an erratum workaround, and a kconfig dependency fix."
* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
powerpc/86xx: Fix PCI interrupt map definition
powerpc/86xx: Move pci1 definition to the include file
powerpc/fsl: Fix build of the dtb embedded kernel images
powerpc/fsl: Fix rcpm compatible string
powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
powerpc/fsl-pci: Add a workaround for PCI 5 errata
powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
powerpc/powernv/npu: Add PE to PHB's list
powerpc/powernv: Fix insufficient memory allocation
powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
powerpc/powernv/npu: Enable NVLink pass through
powerpc/powernv/npu: Rework TCE Kill handling
powerpc/powernv/npu: Add set/unset window helpers
powerpc/powernv/ioda2: Export debug helper pe_level_printk()
...
This means the user can't access /proc/kallsyms, for instance, because
/proc/sys/kernel/kptr_restrict is set to 1.
Instead leave the ref_reloc_sym as NULL and code using it will cope.
This allows 'perf trace' to work on such systems for !root, the only
issue would be when trying to resolve kernel symbols, which happens,
for instance, in some libtraceevent plugins. A warning for that case
will be provided in the next patch in this series.
Noticed in Ubuntu 16.04, which comes with kptr_restrict=1.
Reported-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-knpu3z4iyp2dxpdfm798fac4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>