Currently sym and dso require printing ip and addr because the print
function is tied to those outputs. With callindent it makes sense to
print the symbol or dso without numerical IP or ADDR. So change the
dependency check to only check the underlying attribute.
Also the branch target output relies on the user_set flag to determine
if the branch target should be implicitely printed. When modifying the
fields with + or - also set user_set, so that ADDR can be removed. We
also need to set wildcard_set to make the initial sanity check pass.
This allows to remove a lot of noise in callindent output by dropping
the numerical addresses, which are not all that useful.
Before
% perf script --itrace=cr -F +callindent
swapper 0 [000] 156546.354971: 1 branches: pt_config 0 [unknown] ([unknown]) => ffffffff81010486 pt_config ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_config ffffffff81010499 pt_config ([kernel.kallsyms]) => ffffffff8101063e pt_event_add ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: pt_event_add ffffffff81010635 pt_event_add ([kernel.kallsyms]) => ffffffff8115e687 event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable ffffffff8115e726 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff811579b0 perf_pmu_enable ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void ffffffff81151730 perf_pmu_nop_void ([kernel.kallsyms]) => ffffffff8115e72b event_sched_in.isra.107 ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107 ffffffff8115e737 event_sched_in.isra.107 ([kernel.kallsyms]) => ffffffff8115e7a5 group_sched_in ([kernel.kallsyms])
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax ffffffff8115e7f6 group_sched_in ([kernel.kallsyms]) => ffffffff81a03000 __x86_indirect_thunk_rax ([kernel.kallsyms])
After
% perf script --itrace=cr -F +callindent,-ip,-sym,-symoff
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_config
swapper 0 [000] 156546.354971: 1 branches: pt_event_add
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_enable
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_void
swapper 0 [000] 156546.354971: 1 branches: event_sched_in.isra.107
swapper 0 [000] 156546.354971: 1 branches: __x86_indirect_thunk_rax
swapper 0 [000] 156546.354971: 1 branches: perf_pmu_nop_int
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: event_filter_match
swapper 0 [000] 156546.354971: 1 branches: group_sched_in
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180918123214.26728-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There isn't subcommand `version` when typing `perf help`.
Before :
$ perf help | grep version
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
So add perf-version in command-list.txt for listing it when typing `perf
help`.
After :
$ perf help | grep version
usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS]
version display the version of perf binary
Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180919074911.41931-1-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When building in ClearLinux using 'make PYTHON=python3' with gcc 8.2.1
it fails with:
GEN /tmp/build/perf/python/perf.so
In file included from /usr/include/python3.7m/Python.h:126,
from /git/linux/tools/perf/util/python.c:2:
/usr/include/python3.7m/import.h:58:24: error: redundant redeclaration of ‘_PyImport_AddModuleObject’ [-Werror=redundant-decls]
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *, PyObject *);
^~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/python3.7m/import.h:47:24: note: previous declaration of ‘_PyImport_AddModuleObject’ was here
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *name,
^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
error: command 'gcc' failed with exit status 1
And indeed there is a redundant declaration in that Python.h file, one
with parameter names and the other without, so just add
-Wno-error=redundant-decls to the python setup instructions.
Now perf builds with gcc in ClearLinux with the following Dockerfile:
# docker.io/acmel/linux-perf-tools-build-clearlinux:latest
FROM docker.io/clearlinux:latest
MAINTAINER Arnaldo Carvalho de Melo <acme@kernel.org>
RUN swupd update && \
swupd bundle-add sysadmin-basic-dev
RUN mkdir -m 777 -p /git /tmp/build/perf /tmp/build/objtool /tmp/build/linux && \
groupadd -r perfbuilder && \
useradd -m -r -g perfbuilder perfbuilder && \
chown -R perfbuilder.perfbuilder /tmp/build/ /git/
USER perfbuilder
COPY rx_and_build.sh /
ENV EXTRA_MAKE_ARGS=PYTHON=python3
ENTRYPOINT ["/rx_and_build.sh"]
Now to figure out why the build fails with clang, that is present in the
above container as detected by the rx_and_build.sh script:
clang version 6.0.1 (tags/RELEASE_601/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/sbin
make: Entering directory '/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libslang: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
Makefile.config:331: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
make[1]: *** [Makefile.perf:206: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/git/linux/tools/perf'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-c3khb9ac86s00qxzjrueomme@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When ordering events, we use preallocated buffers to store separate
events. Those buffers currently don't have their own struct, but since
they are basically an array of 'struct ordered_event' objects, we use
the first event to hold buffers data - list head, that holds all buffers
together:
struct ordered_events {
...
struct ordered_event *buffer;
...
};
struct ordered_event {
u64 timestamp;
u64 file_offset;
union perf_event *event;
struct list_head list;
};
This is quite convoluted and error prone as demonstrated by free-ing
issue discovered and fixed by Stephane in here [1].
This patch adds the 'struct ordered_events_buffer' object, that holds
the buffer data and frees it up properly.
[1] - https://marc.info/?l=linux-kernel&m=153376761329335&w=2
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Stephane Eranian <eranian@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180907102455.7030-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It has been pointed out to me many times that it is useful to be able to
switch off AUX records to save the bandwidth for records that actually
matter, for example, in AUX overwrite mode.
The usefulness of PERF_RECORD_AUX is in some of its flags, like the
TRUNCATED flag that tells the decoder where exactly gaps in the trace
are. The OVERWRITE flag, on the other hand will be set on every single
record in overwrite mode. However, a PERF_RECORD_AUX[flags=OVERWRITE] is
generated on every target task's sched_out, which over time adds up to a
lot of useless information.
If any folks out there have userspace that depends on a constant stream
of OVERWRITE records for a good reason, they'll have to let us know.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Markus T Metzger <markus.t.metzger@intel.com>
Link: http://lkml.kernel.org/r/20180404145323.28651-1-alexander.shishkin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Fix finding a symbol by name when multiple maps use the same backing DSO,
so we must first see if that symbol name is in the DSO, then see if it is
inside the range of addresses for that specific map (Adrian Hunter)
- Update the tools copies of UAPI headers, which silences the warnings
emitted when building the tools and in some cases, like for the new
KVM ioctls, results in 'perf trace' being able to translate that
ioctl number to a string (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 1c5aae7710 ("perf machine: Create maps for x86 PTI entry
trampolines") revealed a problem with maps__find_symbol_by_name() that
resulted in probes not being found e.g.
$ sudo perf probe xsk_mmap
xsk_mmap is out of .text, skip it.
Probe point 'xsk_mmap' not found.
Error: Failed to add events.
maps__find_symbol_by_name() can optionally return the map of the found
symbol. It can get the map wrong because, in fact, the symbol is found
on the map's dso, not allowing for the possibility that the dso has more
than one map. Fix by always checking the map contains the symbol.
Reported-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1c5aae7710 ("perf machine: Create maps for x86 PTI entry trampolines")
Link: http://lkml.kernel.org/r/20180907085116.25782-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To get the changes in:
c48300c92a ("vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition")
This makes 'perf trace' and other tools in the future using its
beautifiers in a libbeauty.so library be able to translate these new
ioctl to strings:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > /tmp/after
$ diff -u /tmp/before /tmp/after
--- /tmp/before 2018-09-11 13:10:57.923038244 -0300
+++ /tmp/after 2018-09-11 13:11:20.329012685 -0300
@@ -15,6 +15,7 @@
[0x22] = "SET_VRING_ERR",
[0x23] = "SET_VRING_BUSYLOOP_TIMEOUT",
[0x24] = "GET_VRING_BUSYLOOP_TIMEOUT",
+ [0x25] = "SET_BACKEND_FEATURES",
[0x30] = "NET_SET_BACKEND",
[0x40] = "SCSI_SET_ENDPOINT",
[0x41] = "SCSI_CLEAR_ENDPOINT",
@@ -27,4 +28,5 @@
static const char *vhost_virtio_ioctl_read_cmds[] = {
[0x00] = "GET_FEATURES",
[0x12] = "GET_VRING_BASE",
+ [0x26] = "GET_BACKEND_FEATURES",
};
$
We'll also use this to be able to express syscall filters using symbolic
these symbolic names, something like:
# perf trace --all-cpus -e ioctl(cmd=*GET_FEATURES)
This silences the following warning during perf's build:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-35x71oei2hdui9u0tarpimbq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Problem: perf did not show branch predicted/mispredicted bit in brstack.
Output of perf -F brstack for profile collected
Before:
0x4fdbcd/0x4fdc03/-/-/-/0
0x45f4c1/0x4fdba0/-/-/-/0
0x45f544/0x45f4bb/-/-/-/0
0x45f555/0x45f53c/-/-/-/0
0x7f66901cc24b/0x45f555/-/-/-/0
0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
After:
0x4fdbcd/0x4fdc03/P/-/-/0
0x45f4c1/0x4fdba0/P/-/-/0
0x45f544/0x45f4bb/P/-/-/0
0x45f555/0x45f53c/P/-/-/0
0x7f66901cc24b/0x45f555/P/-/-/0
0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
Cause:
As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
Solution:
Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180802013830.10600-1-jacekt@dugeo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
Kernel:
- Modify breakpoint fixes (Jiri Olsa)
perf annotate:
- Fix parsing aarch64 branch instructions after objdump update (Kim Phillips)
- Fix parsing indirect calls in 'perf annotate' (Martin Liška)
perf probe:
- Ignore SyS symbols irrespective of endianness on PowerPC (Sandipan Das)
perf trace:
- Fix include path for asm-generic/unistd.h on arm64 (Kim Phillips)
Core libraries:
- Fix potential null pointer dereference in perf_evsel__new_idx() (Hisao Tanabe)
- Use fixed size string for comms instead of scanf("%m"), that is
not present in the bionic libc and leads to a crash (Chris Phlipot)
- Fix bad memory access in trace info on 32-bit systems, we were reading
8 bytes from a 4-byte long variable when saving the command line in the
perf.data file. (Chris Phlipot)
Build system:
- Streamline bpf examples and headers installation, clarifying
some install messages. (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Prevent multiplication result truncation on 32bit. Introduced with
the early timestamp reworrk.
- Ensure microcode revision storage to be consistent under all
circumstances
- Prevent write tearing of PTEs
- Prevent confusion of user and kernel reegisters when dumping fatal
signals verbosely
- Make an error return value in a failure path of the vector
allocation negative. Returning EINVAL might the caller assume
success and causes further wreckage.
- A trivial kernel doc warning fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Use WRITE_ONCE() when setting PTEs
x86/apic/vector: Make error return value negative
x86/process: Don't mix user/kernel regs in 64bit __show_regs()
x86/tsc: Prevent result truncation on 32bit
x86: Fix kernel-doc atomic.h warnings
x86/microcode: Update the new microcode revision unconditionally
x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
Pull timekeeping fixes from Thomas Gleixner:
"Two fixes for timekeeping:
- Revert to the previous kthread based update, which is unfortunately
required due to lock ordering issues. The removal caused boot
failures on old Core2 machines. Add a proper comment why the thread
needs to stay to prevent accidental removal in the future.
- Fix a silly typo in a function declaration"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Revert "Remove kthread"
timekeeping: Fix declaration of read_persistent_wall_and_boot_offset()
Pull irqchip fix from Thomas Gleixner:
"A single fix to prevent allocating excessive memory in the GIC/ITS
driver.
While the subject of the patch might suggest otherwise this is a real
fix as some SoCs exceed the memory allocation limits and fail to boot"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/gic-v3-its: Cap lpi_id_bits to reduce memory footprint
Pull cpu hotplug fixes from Thomas Gleixner:
"Two fixes for the hotplug state machine code:
- Move the misplaces smb() in the hotplug thread function to the
proper place, otherwise a half update control struct could be
observed
- Prevent state corruption on error rollback, which causes the state
to advance by one and as a consequence skip it in the bringup
sequence"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Prevent state corruption on error rollback
cpu/hotplug: Adjust misplaced smb() in cpuhp_thread_fun()
Pull random driver fix from Ted Ts'o:
"Fix things so the choice of whether or not to trust RDRAND to
initialize the CRNG is configurable via the boot option
random.trust_cpu={on,off}"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: make CPU trust a boot parameter