'perf record' and 'perf report --dump-raw-trace' supported in this
release.
Example usage:
# perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
# perf report --dump-raw-trace
Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.
Output will contain raw SPE data and its textual representation, such
as:
0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
. 00000000: 49 00 LD
. 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0
. 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1
. 00000014: 9a 00 00 LAT 0 XLAT
. 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1
. 00000022: 98 00 00 LAT 0 TOT
. 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000002e: 49 00 LD
. 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80
. 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1
. 00000042: 9a 00 00 LAT 0 XLAT
. 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1
. 00000050: 98 00 00 LAT 0 TOT
. 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000005c: 48 00 INSN-OTHER
. 0000005e: 42 02 EV RETIRED
. 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1
. 00000069: 98 00 00 LAT 0 TOT
. 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481
...
Other release notes:
- applies to acme's perf/{core,urgent} branches, likely elsewhere
- Report is self-contained within the tool.
Record requires enabling the kernel SPE driver by
setting CONFIG_ARM_SPE_PMU.
- The intel-bts implementation was used as a starting point; its
min/default/max buffer sizes and power of 2 pages granularity need to be
revisited for ARM SPE
- Recording across multiple SPE clusters/domains not supported
- Snapshot support (record -S), and conversion to native perf events
(e.g., via 'perf inject --itrace'), are also not supported
- Technically both cs-etm and spe can be used simultaneously, however
disabled for simplicity in this release
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a compile error:
...
CC util/libunwind/x86_32.o
In file included from util/libunwind/x86_32.c:33:0:
util/libunwind/../../arch/x86/util/unwind-libunwind.c: In function 'libunwind__x86_reg_id':
util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: error: 'EINVAL' undeclared (first use in this function)
return -EINVAL;
^
util/libunwind/../../arch/x86/util/unwind-libunwind.c:110:11: note: each undeclared identifier is reported only once for each function it appears in
mv: cannot stat 'util/libunwind/.x86_32.o.tmp': No such file or directory
make[4]: *** [util/libunwind/x86_32.o] Error 1
make[3]: *** [util] Error 2
make[2]: *** [libperf-in.o] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2
It happens when libunwind-x86 feature is detected.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20171206015040.114574-1-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On an arm64 machine running a CONFIG_RANDOMIZE_BASE=y kernel, perf
kernel symbol resolution fails. Debugging saw symsrc_init calling the
default elf__needs_adjust_symbols() where checks for an ET_DYN (3)
ehdr.e_type failed when they should have succeeded.
Fix by adopting powerpc version of the weak elf__needs_adjust_symbols()
function, as done in commit d233209833 ("perf probe ppc: Fix symbol
fixup issues due to ELF type").
Prior to this patch, perf test 1 would fail:
$ sudo oldperf test -v 1 |& head
1: vmlinux symtab matches kallsyms :
test child forked, pid 33374
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux for symbols
ERR : 0xfffe0000100f1000: do_undefinstr not on kallsyms
ERR : 0xfffe0000100f1320: do_sysinstr not on kallsyms
ERR : 0xfffe0000100f13b0: do_debug_exception not on kallsyms
ERR : 0xfffe0000100f1498: do_mem_abort not on kallsyms
ERR : 0xfffe0000100f1580: do_sp_pc_abort not on kallsyms
...
After applying this patch, perf test 1 now succeeds:
$ sudo ./newperf test -v 1 |& head
1: vmlinux symtab matches kallsyms :
test child forked, pid 33378
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux for symbols
WARN: 0xffff000008081000: diff name v: do_undefinstr k: __exception_text_start
WARN: 0xffff0000080819e8: diff name v: __irqentry_text_end k: __softirqentry_text_start
WARN: 0xffff000008081d08: diff name v: __entry_text_start k: __softirqentry_text_end
WARN: 0xffff00000809db5c: diff name v: flush_icache_range k: __flush_cache_user_range
WARN: 0xffff000008101908: diff name v: sys_ni_syscall k: sys_vm86old
...
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20171214175242.e30450f17f93ad675d968fa3@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit d80406453a ("perf symbols: Allow user probes on versioned
symbols") allows user to find default versioned symbols (with "@@") in
map. However, it did not enable normal versioned symbol (with "@") for
perf-probe. E.g.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Failed to find symbol malloc_get_state in /usr/lib64/libc-2.25.so
Error: Failed to add events.
=====
This solves above issue by improving perf-probe symbol search function,
as below.
=====
# ./perf probe -x /lib64/libc-2.25.so malloc_get_state
Added new event:
probe_libc:malloc_get_state (on malloc_get_state in /usr/lib64/libc-2.25.so)
You can now use it in all perf tools, such as:
perf record -e probe_libc:malloc_get_state -aR sleep 1
# ./perf probe -l
probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /usr/lib64/libc-2.25.so)
=====
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: bhargavb <bhargavaramudu@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Link: http://lkml.kernel.org/r/151275049269.24652.1639103455496216255.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Long ago we decided to be verbotten including files in the kernel git
sources from tools/ living source code, to avoid disturbing kernel
development (and perf's and other tools/) when, say, a kernel hacker
adds something, tests everything but tools/ and have tools/ build
broken.
This got broken recently by s/390, fix it by copying
arch/s390/include/uapi/asm/perf_regs.h to tools/arch/s390/include/uapi/asm/,
making this one be used by means of <asm/perf_regs.h> and updating
tools/perf/check_headers.sh to make sure we are notified when the
original changes, so that we can check if anything is needed on the
tooling side.
This would have been caught by the 'tarkpg' test entry in:
$ make -C tools/perf build-test
When run on a s/390 build system or container.
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f704ef4460 ("s390/perf: add support for perf_regs and libdw")
Link: https://lkml.kernel.org/n/tip-n57139ic0v9uffx8wdqi3d8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The regs_query_register_offset() helper function converts
register name like "%r0" to an offset of a register in user_pt_regs
It is required by the BPF prologue generator.
The user_pt_regs structure was recently added to "asm/ptrace.h".
Hence, update tools/perf/check-headers.sh to keep the header file
in sync with kernel changes.
Suggested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
This patch fixes a bug introduced with commit d9f8dfa9ba ("perf
annotate s390: Implement jump types for perf annotate").
'perf annotate' displays annotated assembler output by reading output of
command objdump and parsing the disassembled lines. For each shown
mnemonic this function sequence is executed:
disasm_line__new()
|
+--> disasm_line__init_ins()
|
+--> ins__find()
|
+--> arch->associate_instruction_ops()
The s390x specific function assigned to function pointer
associate_instruction_ops refers to function s390__associate_ins_ops().
This function checks for supported mnemonics and assigns a NULL pointer
to unsupported mnemonics. However even the NULL pointer is added to the
architecture dependend instruction array.
This leads to an extremely large architecture instruction array
(due to array resize logic in function arch__grow_instructions()).
Depending on the objdump output being parsed the array can end up
with several ten-thousand elements.
This patch checks if a mnemonic is supported and only adds supported
ones into the architecture instruction array. The array does not contain
elements with NULL pointers anymore.
Before the patch (With some debug printf output):
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxbb
real 8m49.679s
user 7m13.008s
sys 0m1.649s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxbb | tail -1
__ins__find sorted:1 nr_instructions:87433 ins:0x341583c0
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 87433.
After the patch (With some printf debug output:)
[root@s35lp76 perf]# time ./perf annotate --stdio > /tmp/xxxaa
real 1m24.553s
user 0m0.587s
sys 0m1.530s
[root@s35lp76 perf]# fgrep '__ins__find sorted:1 nr_instructions:'
/tmp/xxxaa | tail -1
__ins__find sorted:1 nr_instructions:56 ins:0x3f406570
[root@s35lp76 perf]#
The number of different s390x branch/jump/call/return instructions
entered into the array is 56 which is sensible.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20171124094637.55558-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For correct unwinding of user space processes, the floating-point
register contents are required. For example, leaf functions might
use fp registers to temporarily store the return address.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
With support for perf_regs and libdw, you can record and report
call graphs for user space programs. Simply invoke perf with
the --call-graph=dwarf command line option.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[brueckner: added dwfl_thread_state_register_pc() call]
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-and-tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Perf tool need implement a callback to enable using AUX buffer. Perf
will do another mmap() to trigger the setup of AUX buffer in kernel
if there is such callback. The default size of the AUX buffer is set
properly according to the sampling frequency to avoid overflow. It
could also be manually set by -m option of perf.
The interface of perf is not changed. Diagnostic mode sampling
could be started by `perf record -e rBD000` like before.
Signed-off-by: Pu Hou <bjhoupu@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver update for 4.14-rc1.
Lots of different stuff in here, it's been an active development cycle
for some reason. Highlights are:
- updated binder driver, this brings binder up to date with what
shipped in the Android O release, plus some more changes that
happened since then that are in the Android development trees.
- coresight updates and fixes
- mux driver file renames to be a bit "nicer"
- intel_th driver updates
- normal set of hyper-v updates and changes
- small fpga subsystem and driver updates
- lots of const code changes all over the driver trees
- extcon driver updates
- fmc driver subsystem upadates
- w1 subsystem minor reworks and new features and drivers added
- spmi driver updates
Plus a smattering of other minor driver updates and fixes.
All of these have been in linux-next with no reported issues for a
while"
* tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits)
ANDROID: binder: don't queue async transactions to thread.
ANDROID: binder: don't enqueue death notifications to thread todo.
ANDROID: binder: Don't BUG_ON(!spin_is_locked()).
ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl
ANDROID: binder: push new transactions to waiting threads.
ANDROID: binder: remove proc waitqueue
android: binder: Add page usage in binder stats
android: binder: fixup crash introduced by moving buffer hdr
drivers: w1: add hwmon temp support for w1_therm
drivers: w1: refactor w1_slave_show to make the temp reading functionality separate
drivers: w1: add hwmon support structures
eeprom: idt_89hpesx: Support both ACPI and OF probing
mcb: Fix an error handling path in 'chameleon_parse_cells()'
MCB: add support for SC31 to mcb-lpc
mux: make device_type const
char: virtio: constify attribute_group structures.
Documentation/ABI: document the nvmem sysfs files
lkdtm: fix spelling mistake: "incremeted" -> "incremented"
perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file
nvmem: include linux/err.h from header
...
The value passed into the perf.data file for the CONFIGR register in ETMv4
was incorrectly being set to the command line options/ETMv3 value.
Adds bit definitions and function to remap this value to the correct ETMv4
CONFIGR bit values for all selected options.
Signed-off-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'perf report' tool does not display the addresses of kernel module
symbols correctly.
For example symbol qeth_send_ipa_cmd in kernel module qeth.ko has this
relative address for function qeth_send_ipa_cmd():
[root@s8360047 linux]# nm -g drivers/s390/net/qeth.ko | fgrep send_ipa_cmd
0000000000013088 T qeth_send_ipa_cmd
The module is loaded at address:
[root@s8360047 linux]# cat /sys/module/qeth/sections/.text
0x000003ff80296d20
[root@s8360047 linux]#
This should result in a start address of:
0x13088 + 0x3ff80296d20 = 0x3ff802a9da8
Using crash to verify the address on a live system:
[root@s8360046 linux]# crash vmlinux
crash 7.1.9++
Copyright (C) 2002-2016 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
[...]
crash> mod -s qeth drivers/s390/net/qeth.ko
MODULE NAME SIZE OBJECT FILE
3ff8028d700 qeth 151552 drivers/s390/net/qeth.ko
crash> sym qeth_send_ipa_cmd
3ff802a9da8 (T) qeth_send_ipa_cmd [qeth] /root/linux/drivers/s390/net/qeth_core_main.c: 2944
crash>
Now perf report displays the address of symbol qeth_send_ipa_cmd:
symbol__new:
qeth_send_ipa_cmd 0x130f0-0x132ce
There is a difference of 0x68 between the entry in the symbol table (see
nm command above) and perf. The difference is from the offset the .text
segment of qeth.ko:
[root@s8360047 perf]# readelf -a drivers/s390/net/qeth.ko
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.gnu.build-i NOTE 0000000000000000 00000040
0000000000000024 0000000000000000 A 0 0 4
[ 2] .text PROGBITS 0000000000000000 00000068
000000000001c8a0 0000000000000000 AX 0 0 8
As seen the .text segment has an offset of 0x68 with start address 0x0.
Therefore 0x68 is added to the address of qeth_send_ipa_cmd and thus
0x13088 + 0x68 = 0x130f0 is displayed.
This is wrong, perf report needs to display the start address of symbol
qeth_send_ipa_cmd at 0x13088 + qeth.ko.text section start address.
The qeth.ko module .text start address is available in the qeth.ko DSO
map. Just identify the kernel module symbols and correct the addresses.
With the fix I see this correct address for symbol: symbol__new:
qeth_send_ipa_cmd 0x3ff802a9da8-0x3ff802a9f86
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
LPU-Reference: 20170803134902.47207-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-q8lktlpoxb5e3dj52u1s1rw4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On s390x the kernel text segment starts at address 0x0. When perf
report reads kernel symbols from vmlinux file it adds an offset of
0x1000.
For example see symbol set_reset_devices:
[root@s8360047 linux-devel]# nm -A vmlinux| fgrep set_reset_devices
vmlinux:0000000001379000 t set_reset_devices
[root@s8360047 linux-devel]#
[root@s8360047 linux-devel]# fgrep set_reset_devices /proc/kallsyms
0000000001379000 t set_reset_devices
[root@s8360047 linux-devel]#
The kernel symbol table and the vmlinux file have the same address for
symbol set_reset_devices namely 1379000.
When perf report reads this symbols it displays it with address
symbol__new: set_reset_devices 0x137a000-0x137a018
There is a difference between perf report and vmlinux of 0x1000.
The reason for the difference is at kernel symbol load time in function
dso__load_sym(). The vmlinux file is investigated with its ELF header.
Command readelf shows this:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .text PROGBITS 0000000000000000 00001000
0000000000b0e0c2 0000000000000000 AX 0 0 128
This leads to an invalid calculation of the symbol start address, see
file utit/symbol-elf.c line 974:
/* Adjust symbol to map to file offset */
if (adjust_kernel_syms)
sym.st_value -= shdr.sh_addr - shdr.sh_offset;
With shdr.sh_addr set to 0x0 and shdr.sh_offset set to 0x1000 as read
from the ELF .text section 0x1000 is added to the symbol address.
I would like to fix this by introducing an archticture specific function
named elf__needs_adjust_symbols(). This is the same approach as done by
PowerPC. The function currently does not exist for s390x and the
default weak one is used. The s390x specific one returns false when
symsrc_init() is invoked for kernel symbols and results in variable
adjust_kernel_syms being false. This omits the adjustment and the
correct address is displayed (when symbol resolvement does not work).
The s390x specific function returns false for kernel symbol adjustment
and returns true for kernel modules, processes and shared libraries.
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20170713130252.6167-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
An earlier kernel patch allowed enabling PT and LBR at the same time on
Goldmont.
commit ccbebba4c6 ("perf/x86/intel/pt: Bypass PT vs. LBR exclusivity
if the core supports it")
However, users still cannot use Intel PT and LBRs simultaneously. $
sudo perf record -e cycles,intel_pt//u -b -- sleep 1 Error: PMU
Hardware doesn't support sampling/overflow-interrupts.
PT implicitly adds dummy event in perf tool. dummy event is software
event which doesn't support LBR.
Always setting no branch for dummy event in Intel PT.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170630141656.1626-2-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Macro fusion merges two instructions to a single micro-op. Intel core
platform performs this hardware optimization under limited
circumstances.
For example, CMP + JCC can be "fused" and executed /retired together.
While with sampling this can result in the sample sometimes being on the
JCC and sometimes on the CMP. So for the fused instruction pair, they
could be considered together.
On Nehalem, fused instruction pairs:
cmp/test + jcc.
On other new CPU:
cmp/test/add/sub/and/inc/dec + jcc.
This patch adds an x86-specific function which checks if 2 instructions
are in a "fused" pair. For non-x86 arch, the function is just NULL.
Changelog:
v4: Move the CPU model checking to symbol__disassemble and save the CPU
family/model in arch structure.
It avoids checking every time when jump arrow printed.
v3: Add checking for Nehalem (CMP, TEST). For other newer Intel CPUs
just check it by default (CMP, TEST, ADD, SUB, AND, INC, DEC).
v2: Remove the original weak function. Arnaldo points out that doing it
as a weak function that will be overridden by the host arch doesn't
work. So now it's implemented as an arch-specific function.
Committer fix:
Do not access evsel->evlist->env->cpuid, ->env can be null, introduce
perf_evsel__env_cpuid(), just like perf_evsel__env_arch(), also used in
this function call.
The original patch was segfaulting 'perf top' + annotation.
But this essentially disables this fused instructions augmentation in
'perf top', the right thing is to get the cpuid from the running kernel,
left for a later patch tho.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1499403995-19857-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add ptwrite to the op code map and the perf tools new instructions test.
To run the test:
$ tools/perf/perf test "x86 ins"
39: Test x86 instruction decoder - new instructions : Ok
Or to see the details:
$ tools/perf/perf test -v "x86 ins" 2>&1 | grep ptwrite
For information about ptwrite, refer the Intel SDM.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Porting PPC to libdw only needs an architecture-specific hook to move
the register state from perf to libdw.
The ARM and x86 architectures already use libdw, and it is useful to
have as much common code for the unwinder as possible. Mark Wielaard
has contributed a frame-based unwinder to libdw, so that unwinding works
even for binaries that do not have CFI information. In addition,
libunwind is always preferred to libdw by the build machinery so this
cannot introduce regressions on machines that have both libunwind and
libdw installed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1496312681-20133-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>