Commit Graph

7289 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
39878d492c perf trace: Pretty print getrandom() args
# trace -e getrandom
  35622.560 ( 0.023 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35622.585 ( 0.006 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35622.594 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35627.395 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK    ) = 16
  35630.940 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK      ) = 16
  35718.613 ( 0.015 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35718.629 ( 0.005 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35718.637 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  35719.355 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK    ) = 16
  35721.042 ( 0.030 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK      ) = 16
  41090.830 ( 0.012 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41090.845 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41090.851 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41091.750 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK    ) = 16
  41091.823 ( 0.006 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK      ) = 16
  41122.078 ( 0.053 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41122.129 ( 0.009 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41122.139 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16
  41124.492 ( 0.007 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK    ) = 16
  41124.470 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK      ) = 16
  41590.832 ( 0.014 ms): chrome/5957 getrandom(buf: 0x7fabac7b15b0, count: 16, flags: NONBLOCK      ) = 16
  41590.884 ( 0.004 ms): chrome/5957 getrandom(buf: 0x7fabac7b15c0, count: 16, flags: NONBLOCK      ) = 16

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-gca0n1p3aca3depey703ph2q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31 10:42:23 -03:00
Arnaldo Carvalho de Melo
997bba8cf1 perf trace: Pretty print seccomp() args
E.g:

  # trace -e seccomp
   200.061 (0.009 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC                       ) = -1 EFAULT Bad address
   200.910 (0.121 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC, uargs: 0x7fff57479fe0) = 0

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t369uckshlwp4evkks4bcoo7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31 10:42:22 -03:00
Arnaldo Carvalho de Melo
3ed5ca2eff perf trace: Do not process PERF_RECORD_LOST twice
We catch this record to provide a visual indication that events are
getting lost, then call the default method to allow extra logging shared
with the other tools to take place.

This extra logging was done twice because we were continuing to the
"default" clause where machine__process_event() will end up calling
machine__process_lost_event() again, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wus2zlhw3qo24ye84ewu4aqw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31 10:42:22 -03:00
Ingo Molnar
643cb15ba0 Merge tag 'perf-core-for-mingo-20160330' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes:

User visible changes:

  - Add support for skipping itrace instructions, useful to fast forward
    processor trace (Intel PT, BTS) to right after initialization code at the start
    of a workload (Andi Kleen)

  - Add support for backtraces in perl 'perf script's (Dima Kogan)

  - Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)

  - Make -f/--force option documentation consistent across tools (Jiri Olsa)

Infrastructure changes:

  - Add 'perf test' to check for event times (Jiri Olsa)

  - 'perf config' cleanups (Taeung Song)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31 08:33:43 +02:00
Anton Blanchard
9f56c092b9 perf jit: genelf makes assumptions about endian
Commit 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
incorrectly assumed that PowerPC is big endian only.

Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking
for __BYTE_ORDER == __BIG_ENDIAN.

The PowerPC checks were also incorrect, they do not match what gcc
emits. We should first look for __powerpc64__, then __powerpc__.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Carl Love <cel@us.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/20160329175944.33a211cc@kryten
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 18:12:06 -03:00
Andres Freund
9098903555 perf hists: Fix determination of a callchain node's childlessness
The 4b3a321223 ("perf hists browser: Support flat callchains") commit
over-aggressively tried to optimize callchain_node__init_have_children().

That lead to --tui mode not allowing to expand call chain elements if a
call chain element had only one parent. That's why --inverted callgraphs
looked halfway sane, but plain ones didn't.

Revert that individual optimization, it wasn't really related to the
rest of the commit.

Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 4b3a321223 ("perf hists browser: Support flat callchains")
Link: http://lkml.kernel.org/r/20160330190245.GB13305@awork2.anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 18:08:39 -03:00
Andi Kleen
d1706b39f0 perf tools: Add support for skipping itrace instructions
When using 'perf script' to look at PT traces it is often useful to
ignore the initialization code at the beginning.

On larger traces which may have many millions of instructions in
initialization code doing that in a pipeline can be very slow, with perf
script spending a lot of CPU time calling printf and writing data.

This patch adds an extension to the --itrace argument that skips 'n'
events (instructions, branches or transactions) at the beginning. This
is much more efficient.

v2:
Add support for BTS (Adrian Hunter)
Document in itrace.txt
Fix branch check
Check transactions and instructions too

Committer note:

To test intel_pt one needs to make sure VT-x isn't active, i.e.
stopping KVM guests on the test machine, as described by Andi Kleen
at http://lkml.kernel.org/r/20160301234953.GD23621@tassilo.jf.intel.com

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1459187142-20035-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Dima Kogan
f7380c12ec perf script perl: Perl scripts now get a backtrace, like the python ones
We have some infrastructure to use perl or python to analyze logs
generated by perf.  Prior to this patch, only the python tools had
access to backtrace information.  This patch makes this information
available to perl scripts as well.  Example:

  Let's look at malloc() calls made by the seq utility.  First we
  create a probe point:

      $ perf probe -x /lib/x86_64-linux-gnu/libc.so.6 malloc
      Added new events:
      ...

  Now we run seq, while monitoring malloc() calls with perf

      $ perf record --call-graph=dwarf -e probe_libc:malloc seq 5
      1
      2
      3
      4
      5
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.064 MB perf.data (6 samples) ]

  We can use perf to look at its log to see the malloc calls and the backtrace

      $ perf script
      seq 14195 [000] 1927993.748254: probe_libc:malloc: (7f9ff8edd320) bytes=0x22
                  7f9ff8edd320 malloc (/lib/x86_64-linux-gnu/libc-2.22.so)
                  7f9ff8e8eab0 set_binding_values.part.0 (/lib/x86_64-linux-gnu/libc-2.22.so)
                  7f9ff8e8eda1 __bindtextdomain (/lib/x86_64-linux-gnu/libc-2.22.so)
                        401b22 main (/usr/bin/seq)
                  7f9ff8e82610 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.22.so)
                        402799 _start (/usr/bin/seq)
      ...

  We can also use the scripting facilities.  We create a skeleton perl
  script that simply prints out the events

      $ perf script -g perl
      generated Perl script: perf-script.pl

  We can then use this script to see the malloc() calls with a
  backtrace.  Prior to this patch, the backtrace was not available to
  the perl scripts.

      $ perf script -s perf-script.pl
      probe_libc::malloc  0 1927993.748254260  14195 seq   __probe_ip=140325052863264, bytes=34
              [7f9ff8edd320] malloc
              [7f9ff8e8eab0] set_binding_values.part.0
              [7f9ff8e8eda1] __bindtextdomain
              [401b22] main
              [7f9ff8e82610] __libc_start_main
              [402799] _start
      ...

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/87mvphzld0.fsf@secretsauce.net
Signed-off-by: Dima Kogan <dima@secretsauce.net>
2016-03-30 11:14:09 -03:00
Taeung Song
37194f443a perf config: Rename 'v' to 'home' in set_buildid_dir()
Change the variable name 'v' to 'home' to make it more readable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1459099340-16911-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Taeung Song
9cb5987c82 perf config: Rework buildid_dir_command_config to perf_buildid_config
To avoid repeated calling perf_config() remove
buildid_dir_command_config() and add new perf_buildid_config into
perf_default_config.

Because perf_config() is already called with perf_default_config at
main().

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1459099340-16911-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:09 -03:00
Taeung Song
58cb9d650b perf config: Remove duplicated set_buildid_dir calls
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1459099340-16911-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:08 -03:00
Jiri Olsa
b31d660df3 perf tests: Add test to check for event times
This test creates software event 'cpu-clock' attaches it in several ways
and checks that enabled and running times match.

Committer notes:

Testing it:

  [acme@jouet linux]$ perf test -v times
  44: Test events times                                        :
  --- start ---
  test child forked, pid 27170
  attaching to spawned child, enable on exec
    OK    : ena 307328, run 307328
  attaching to current thread as enabled
    OK    : ena 7826, run 7826
  attaching to current thread as disabled
    OK    : ena 738, run 738
  attaching to CPU 0 as enabled
    SKIP  : not enough rights
  attaching to CPU 0 as enabled
    SKIP  : not enough rights
  test child finished with -2
  ---- end ----
  Test events times: Skip
  [acme@jouet linux]$

  [root@jouet ~]# perf test times
  44: Test events times                                        : Ok
  [root@jouet ~]# perf test -v times
  44: Test events times                                        :
  --- start ---
  test child forked, pid 27306
  attaching to spawned child, enable on exec
    OK    : ena 479290, run 479290
  attaching to current thread as enabled
    OK    : ena 11356, run 11356
  attaching to current thread as disabled
    OK    : ena 987, run 987
  attaching to CPU 0 as enabled
    OK    : ena 3717, run 3717
  attaching to CPU 0 as enabled
    OK    : ena 2323, run 2323
  test child finished with 0
  ---- end ----
  Test events times: Ok
  [root@jouet ~]#

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:08 -03:00
Jiri Olsa
e0be62cc03 perf tools: Make -f/--force option documentation consistent across tools
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:08 -03:00
Jiri Olsa
592dac6f35 perf tools: Make hists__collapse_insert_entry static
No need to export hists__collapse_insert_entry function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:07 -03:00
Jiri Olsa
ad16511b0e perf mem: Add -U/-K (--all-user/--all-kernel) options
Add -U/-K (--all-user/--all-kernel) options to use the perf record
--all-user/--all-kernel options.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-30 11:14:07 -03:00
Arnaldo Carvalho de Melo
3ea223adcb perf tools: Add missing initialization of perf_sample.cpumode in synthesized samples
In 473398a21d ("perf tools: Add cpumode to struct perf_sample"), I
missed some places where perf_sample fields are directly initialized in
addition to what is done in perf_evsel__parse_sample(), namely when
synthesizing PERF_RECORD_{MMAP*,COMM,FORK,EXIT} for pre-existing threads
and also in intel_pt and intel_bts when synthesizing events from
processor trace, the jitdump code also was affected, fix it.

The problem was noticed with running:

  # perf record -e intel_pt//u true
  # perf script

Where the samples wouldn't get resolved because perf_sample.cpumode
would be left as zero, i.e. PERF_RECORD_MISC_CPUMODE_UNKNOWN, not
resolving as kernel, hypervisor or user cpu modes.

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 473398a21d ("perf tools: Add cpumode to struct perf_sample")
Link: http://lkml.kernel.org/n/tip-n5sdauxgk24d5nun8kuuu2mh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-29 20:03:56 -03:00
Sukadev Bhattiprolu
379649cfea perf tools: Fix build break on powerpc
Commit 531d241063 ("perf tools: Do not include stringify.h from the
kernel sources") seems to have accidentially removed the inclusion of
"util/header.h" from "arch/powerpc/util/header.c".

"util/header.h" provides the prototype for get_cpuid() and is needed to
build perf on Powerpc:

	arch/powerpc/util/header.c:17:1: error: no previous prototype for 'get_cpuid' [-Werror=missing-prototypes]

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 531d241063 ("perf tools: Do not include stringify.h from the kernel sources")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[ Included "util.h" too, to get the scnprintf() prototype ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-28 17:46:20 -03:00
Linus Torvalds
3fa2fe2ce0 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains various perf fixes on the kernel side, plus three
  hw/event-enablement late additions:

   - Intel Memory Bandwidth Monitoring events and handling
   - the AMD Accumulated Power Mechanism reporting facility
   - more IOMMU events

  ... and a final round of perf tooling updates/fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
  perf llvm: Use strerror_r instead of the thread unsafe strerror one
  perf llvm: Use realpath to canonicalize paths
  perf tools: Unexport some methods unused outside strbuf.c
  perf probe: No need to use formatting strbuf method
  perf help: Use asprintf instead of adhoc equivalents
  perf tools: Remove unused perf_pathdup, xstrdup functions
  perf tools: Do not include stringify.h from the kernel sources
  tools include: Copy linux/stringify.h from the kernel
  tools lib traceevent: Remove redundant CPU output
  perf tools: Remove needless 'extern' from function prototypes
  perf tools: Simplify die() mechanism
  perf tools: Remove unused DIE_IF macro
  perf script: Remove lots of unused arguments
  perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
  perf machine: Rename perf_event__preprocess_sample to machine__resolve
  perf tools: Add cpumode to struct perf_sample
  perf tests: Forward the perf_sample in the dwarf unwind test
  perf tools: Remove misplaced __maybe_unused
  perf list: Fix documentation of :ppp
  perf bench numa: Fix assertion for nodes bitfield
  ...
2016-03-24 10:02:14 -07:00
Arnaldo Carvalho de Melo
6a1a77baa7 perf bench: Fix detached tarball building due to missing 'perf bench memcpy' headers
A change on kernel files included by the 'perf bench memcpy' code grew some new
include deps, breaking the detached tarball build:

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
  tests/make:302: recipe for target 'tarpkg' failed
  make[1]: *** [tarpkg] Error 2
  Makefile:102: recipe for target 'build-test' failed
  make: *** [build-test] Error 2
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $ cat tools/perf/tarpkg
  ./tests/perf-targz-src-pkg .
    PERF_VERSION = 4.5.g05f5ec
    PERF_VERSION = 4.5.g05f5ec
  In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
  bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
  compilation terminated.
  mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
  make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
  make[5]: *** Waiting for unfinished jobs....
  make[4]: *** [bench] Error 2
  make[4]: *** Waiting for unfinished jobs....
  make[3]: *** [perf-in.o] Error 2
  make[3]: *** Waiting for unfinished jobs....
  make[2]: *** [all] Error 2
  $

Add arch/*/include/asm/*features.h to tools/perf/MANIFEST so that we can
continue to use detached tarballs to build perf.

Now it builds ok, doing it manually:

  $ make help | grep perf
    perf-tar-src-pkg    - Build perf-4.5.0.tar source tarball
    perf-targz-src-pkg  - Build perf-4.5.0.tar.gz source tarball
    perf-tarbz2-src-pkg - Build perf-4.5.0.tar.bz2 source tarball
    perf-tarxz-src-pkg  - Build perf-4.5.0.tar.xz source tarball
  $ ls -la perf-4.5.0.tar
  ls: cannot access perf-4.5.0.tar: No such file or directory
  $ make perf-tar-src-pkg
    TAR
    PERF_VERSION = 4.5.g32c25b
  $ ls -la perf-4.5.0.tar
  -rw-rw-r--. 1 acme acme 6318080 Mar 24 11:52 perf-4.5.0.tar
  $ mv perf-4.5.0.tar /tmp
  $ cd /tmp
  $ tar xf perf-4.5.0.tar
  $ cd perf-4.5.0/tools/perf
  $ make > /dev/null
  PERF_VERSION = 4.5.g32c25b
  $ ls -la perf
  -rwxrwxr-x. 1 acme acme 14046416 Mar 24 11:53 perf
  $ ./perf --version
  perf version 4.5.g32c25b
  $ perf bench
  Usage:
	perf bench [<common options>] <collection> <benchmark> [<options>]

        # List of all available benchmark collections:

         sched: Scheduler and IPC benchmarks
           mem: Memory access benchmarks
          numa: NUMA scheduling and MM benchmarks
         futex: Futex stressing benchmarks
           all: All benchmarks

  $ perf bench mem

        # List of available benchmarks for collection 'mem':

        memcpy: Benchmark for memcpy() functions
        memset: Benchmark for memset() functions
           all: Run all memory access benchmarks

  $ perf bench mem memcpy
  # Running 'mem/memcpy' benchmark:
  # function 'default' (Default memcpy() provided by glibc)
  # Copying 1MB bytes ...

        15.024038 GB/sec
  # function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB bytes ...

        17.438616 GB/sec
  # function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB bytes ...

        25.040064 GB/sec
  # function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
  # Copying 1MB bytes ...

        25.040064 GB/sec
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2c2sncwffuabw58fj1pw86gu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-24 12:28:57 -03:00
Arnaldo Carvalho de Melo
cde88355f2 perf tests: Fix tarpkg build test error output redirection
So we need to trow away just stdout, leaving stderr to be caught by
the build tests infrastructure, so that we can see what went wrong
when the tarpkg build test fails:

  $ make -C tools/perf build-test
  make: Entering directory '/home/acme/git/linux/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
  tests/make:302: recipe for target 'tarpkg' failed
  make[1]: *** [tarpkg] Error 2
  Makefile:102: recipe for target 'build-test' failed
  make: *** [build-test] Error 2
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $ cat tools/perf/tarpkg
  ./tests/perf-targz-src-pkg .
    PERF_VERSION = 4.5.g05f5ec
    PERF_VERSION = 4.5.g05f5ec
  In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
  bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
  compilation terminated.
  mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
  make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
  make[5]: *** Waiting for unfinished jobs....
  make[4]: *** [bench] Error 2
  make[4]: *** Waiting for unfinished jobs....
  make[3]: *** [perf-in.o] Error 2
  make[3]: *** Waiting for unfinished jobs....
  make[2]: *** [all] Error 2
  $

So the test flow is:

1. Run: 'make -C tools/perf build-test'

2. One of its tests failed, in this case, the 'tarpkg' one

3. Look at what went wrong, by looking at the output of that test, in
   tools/perf/tarpkg

Admittedly, this should be shortcircuited to showing what went wrong directly
from the 'make build-test' step, but lets first fix this tarpkg one and the
problem it spotted, which should be fixed by adding some extra file to the
tools/perf/MANIFEST so that detached tarballs continue being self contained and
build successfully.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ynld6egoxolmftcddpnd7oh6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-24 12:26:41 -03:00
Arnaldo Carvalho de Melo
76267147f2 perf llvm: Use strerror_r instead of the thread unsafe strerror one
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5njrq9dltckgm624omw9ljgu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:42:21 -03:00
Arnaldo Carvalho de Melo
78478269d2 perf llvm: Use realpath to canonicalize paths
To kill the last user of make_nonrelative_path(), that gets ditched,
one more panicking function killed.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3hu56rvyh4q5gxogovb6ko8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:39:19 -03:00
Arnaldo Carvalho de Melo
0741208a7c perf tools: Unexport some methods unused outside strbuf.c
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-nq1wvtky4mpu0nupjyar7sbw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 17:09:53 -03:00
Arnaldo Carvalho de Melo
88fd633cdf perf probe: No need to use formatting strbuf method
We have addch() for chars, add() for fixed size data, and addstr() for
variable length strings, use them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ap02fn2xtvpduj2j6b2o1j4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 16:53:05 -03:00
Arnaldo Carvalho de Melo
a610f5cbb2 perf help: Use asprintf instead of adhoc equivalents
That doesn't chekcs malloc return and that, when using strbuf, if it
can't grow, just explodes away via die().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vr8qsjbwub7e892hpa9msz95@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 16:36:07 -03:00
Arnaldo Carvalho de Melo
cf47a8aede perf tools: Remove unused perf_pathdup, xstrdup functions
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-s87zi5d03m6rz622y1z6rlsa@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:27:33 -03:00
Arnaldo Carvalho de Melo
531d241063 perf tools: Do not include stringify.h from the kernel sources
Use instead the copy just made to tools/include/linux/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q736w12nwy98x5ox2hamp5ow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:21:15 -03:00
Arnaldo Carvalho de Melo
3938bad44e perf tools: Remove needless 'extern' from function prototypes
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-w246stf7ponfamclsai6b9zo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 15:06:35 -03:00
Arnaldo Carvalho de Melo
e476343860 perf tools: Simplify die() mechanism
This should die altogether, but for now lets remove a bit of this stuff,
as it is not used at all.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ade3n99xscldhg5mx2vzd8p3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:32:31 -03:00
Arnaldo Carvalho de Melo
236f71eb94 perf tools: Remove unused DIE_IF macro
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elxg25jd4dhwod4wqbko87qh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:30:42 -03:00
Arnaldo Carvalho de Melo
a3dff304ca perf script: Remove lots of unused arguments
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:11:03 -03:00
Arnaldo Carvalho de Melo
c2740a87ca perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
Since none of the perf_event fields are used anymore, just the
perf_sample ones, and since this resolves to (map, symbol) from data
structures within struct thread, rename it to thread__resolve and make
the argument ordering similar to the one in machine__resolve().

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2b33hs9bp550tezzlhl4kejh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:08 -03:00
Arnaldo Carvalho de Melo
bb3eb56622 perf machine: Rename perf_event__preprocess_sample to machine__resolve
Since we only deal with fields in the passed struct perf_sample move
this method to struct machine, that is where the perf_sample fields
will be resolved to a struct addr_location, i.e. thread, map, symbol,
etc.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-a1ww2lbm2vbuqsv4p7ilubu9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:08 -03:00
Arnaldo Carvalho de Melo
473398a21d perf tools: Add cpumode to struct perf_sample
To avoid parsing event->header.misc in many locations.

This will also allow setting perf.sample.{ip,cpumode} in a single place,
from tracepoint fields, as needed by 'perf kvm' with PPC guests, where
the guest hardware counters is not available at the host.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp3yradhyt6q3wl895b1aat0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:07 -03:00
Arnaldo Carvalho de Melo
eb9f03231b perf tests: Forward the perf_sample in the dwarf unwind test
It _will_ be used, no sense in receiving it and nor fowarding it along.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ht8v5et209wuoh5o6nh9pzyq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:07 -03:00
Arnaldo Carvalho de Melo
b8f8eb84f4 perf tools: Remove misplaced __maybe_unused
All over the tree.

Cc: David Ahern <dsahern@gmail.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-23 12:03:04 -03:00
Andi Kleen
4ca0d8193f perf list: Fix documentation of :ppp
Correctly document what is implemented for :ppp on Intel CPUs in recent
kernels.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1458575793-12091-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-22 10:01:45 -03:00
Jakub Jelen
3c52b658b8 perf bench numa: Fix assertion for nodes bitfield
Comparing bits and bytes in numa benchmark assertion

I hit the issue on two socket Power8 machine presenting its numa nodes
as 0,1,16,17 (according to numactl). Therefore I got error (and hang of
parent process):

    perf: bench/numa.c:296: bind_to_memnode: Assertion `!(g->p.nr_nodes > (int)sizeof(nodemask))' failed.

This is obviously false positive. We can fit all the 18 nodes into
bitfield of 8 bytes (long on 64b architecture).

Signed-off-by: Jakub Jelen <jakuje@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jakub Jelen <jjelen@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: trivial@kernel.org
Link: http://lkml.kernel.org/r/1458388687-24421-1-git-send-email-jakuje@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-21 17:46:57 -03:00
Linus Torvalds
26660a4046 Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull 'objtool' stack frame validation from Ingo Molnar:
 "This tree adds a new kernel build-time object file validation feature
  (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
  It was written by and is maintained by Josh Poimboeuf.

  The motivation: there's a category of hard to find kernel bugs, most
  of them in assembly code (but also occasionally in C code), that
  degrades the quality of kernel stack dumps/backtraces.  These bugs are
  hard to detect at the source code level.  Such bugs result in
  incorrect/incomplete backtraces most of time - but can also in some
  rare cases result in crashes or other undefined behavior.

  The build time correctness checking is done via the new 'objtool'
  user-space utility that was written for this purpose and which is
  hosted in the kernel repository in tools/objtool/.  The tool's (very
  simple) UI and source code design is shaped after Git and perf and
  shares quite a bit of infrastructure with tools/perf (which tooling
  infrastructure sharing effort got merged via perf and is already
  upstream).  Objtool follows the well-known kernel coding style.

  Objtool does not try to check .c or .S files, it instead analyzes the
  resulting .o generated machine code from first principles: it decodes
  the instruction stream and interprets it.  (Right now objtool supports
  the x86-64 architecture.)

  From tools/objtool/Documentation/stack-validation.txt:

   "The kernel CONFIG_STACK_VALIDATION option enables a host tool named
    objtool which runs at compile time.  It has a "check" subcommand
    which analyzes every .o file and ensures the validity of its stack
    metadata.  It enforces a set of rules on asm code and C inline
    assembly code so that stack traces can be reliable.

    Currently it only checks frame pointer usage, but there are plans to
    add CFI validation for C files and CFI generation for asm files.

    For each function, it recursively follows all possible code paths
    and validates the correct frame pointer state at each instruction.

    It also follows code paths involving special sections, like
    .altinstructions, __jump_table, and __ex_table, which can add
    alternative execution paths to a given instruction (or set of
    instructions).  Similarly, it knows how to follow switch statements,
    for which gcc sometimes uses jump tables."

  When this new kernel option is enabled (it's disabled by default), the
  tool, if it finds any suspicious assembly code pattern, outputs
  warnings in compiler warning format:

    warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
    warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
    warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
    warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer

  ... so that scripts that pick up compiler warnings will notice them.
  All known warnings triggered by the tool are fixed by the tree, most
  of the commits in fact prepare the kernel to be warning-free.  Most of
  them are bugfixes or cleanups that stand on their own, but there are
  also some annotations of 'special' stack frames for justified cases
  such entries to JIT-ed code (BPF) or really special boot time code.

  There are two other long-term motivations behind this tool as well:

   - To improve the quality and reliability of kernel stack frames, so
     that they can be used for optimized live patching.

   - To create independent infrastructure to check the correctness of
     CFI stack frames at build time.  CFI debuginfo is notoriously
     unreliable and we cannot use it in the kernel as-is without extra
     checking done both on the kernel side and on the build side.

  The quality of kernel stack frames matters to debuggability as well,
  so IMO we can merge this without having to consider the live patching
  or CFI debuginfo angle"

* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  objtool: Only print one warning per function
  objtool: Add several performance improvements
  tools: Copy hashtable.h into tools directory
  objtool: Fix false positive warnings for functions with multiple switch statements
  objtool: Rename some variables and functions
  objtool: Remove superflous INIT_LIST_HEAD
  objtool: Add helper macros for traversing instructions
  objtool: Fix false positive warnings related to sibling calls
  objtool: Compile with debugging symbols
  objtool: Detect infinite recursion
  objtool: Prevent infinite recursion in noreturn detection
  objtool: Detect and warn if libelf is missing and don't break the build
  tools: Support relative directory path for 'O='
  objtool: Support CROSS_COMPILE
  x86/asm/decoder: Use explicitly signed chars
  objtool: Enable stack metadata validation on 64-bit x86
  objtool: Add CONFIG_STACK_VALIDATION option
  objtool: Add tool to perform compile-time stack metadata validation
  x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
  sched: Always inline context_switch()
  ...
2016-03-20 18:23:21 -07:00
Wang Nan
73cdf0c6ea perf symbols: Record text offset in dso to calculate objdump address
Store DSO's .text offset into DSO, used for VDSOs and will also be used for
other needs, like handling kernel modules.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-2-git-send-email-wangnan0@huawei.com
[ Extracted from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-18 14:23:59 -03:00
Arnaldo Carvalho de Melo
ca70c24fb1 tools: Move utilities.mak from perf to tools/scripts/
As it is used by several other tools, better move it outside tools/perf.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-34s9kue3xq9w5mijdmfrfx8s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-18 13:57:20 -03:00
Vlastimil Babka
420adbe9fc mm, tracing: unify mm flags handling in tracepoints and printk
In tracepoints, it's possible to print gfp flags in a human-friendly
format through a macro show_gfp_flags(), which defines a translation
array and passes is to __print_flags().  Since the following patch will
introduce support for gfp flags printing in printk(), it would be nice
to reuse the array.  This is not straightforward, since __print_flags()
can't simply reference an array defined in a .c file such as mm/debug.c
- it has to be a macro to allow the macro magic to communicate the
format to userspace tools such as trace-cmd.

The solution is to create a macro __def_gfpflag_names which is used both
in show_gfp_flags(), and to define the gfpflag_names[] array in
mm/debug.c.

On the other hand, mm/debug.c also defines translation tables for page
flags and vma flags, and desire was expressed (but not implemented in
this series) to use these also from tracepoints.  Thus, this patch also
renames the events/gfpflags.h file to events/mmflags.h and moves the
table definitions there, using the same macro approach as for gfpflags.
This allows translating all three kinds of mm-specific flags both in
tracepoints and printk.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Vlastimil Babka
14e0a214d6 tools, perf: make gfp_compact_table up to date
When updating tracing's show_gfp_flags() I have noticed that perf's
gfp_compact_table is also outdated.  Fill in the missing flags and place
a note in gfp.h to increase chance that future updates are synced.
Convert the __GFP_X flags from "GFP_X" to "__GFP_X" strings in line with
the previous patch.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 16:55:16 -07:00
Sukadev Bhattiprolu
4c9d6c18fd perf test: Remove 'core_id' check in topo test
The topology test case of 'perf test' seems to be broken on my x86
system - due to the comparison of a "core-id" with # of CPUs online.

There are 8 online CPUs:

	$ cat /sys/devices/system/cpu/online
	0-7

but core-ids are not sequential and some core-ids exceed the number
of online CPUs.

	$ cat /sys/devices/system/cpu/cpu?/topology/core_id
	0
	1
	9
	10
	0
	1
	9
	10

Looks like we can safely remove the check.  Output before:

	$ perf --version
	perf version 4.4.rc1.g34258a

	$ perf test -v topo
	36: Test topology in session                                 :
	--- start ---
	test child forked, pid 5906
	templ file: /tmp/perf-test-vCwWG3
	core_id number is too big.You may need to upgrade the perf tool.
	test child interrupted
	---- end ----
	Test topology in session: FAILED!

and after:

	$ perf test -v topo
	36: Test topology in session                                 :
	--- start ---
	test child forked, pid 6532
	templ file: /tmp/perf-test-y10wFJ
	CPU 0, core 0, socket 0
	CPU 1, core 1, socket 0
	CPU 2, core 9, socket 0
	CPU 3, core 10, socket 0
	CPU 4, core 0, socket 1
	CPU 5, core 1, socket 1
	CPU 6, core 9, socket 1
	CPU 7, core 10, socket 1
	test child finished with 0
	---- end ----
	Test topology in session: Ok

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20151203233219.GA27696@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-11 13:45:04 -03:00
Andi Kleen
206cab651d perf stat: Add --metric-only support for -A
Add metric only support for -A too. This requires a new print function
that prints the metrics in the right order.

v2: Fix manpage
v3: Simplify nrcpus computation

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-7-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:50:47 -03:00
Andi Kleen
54b5091606 perf stat: Implement --metric-only mode
Add a new mode to only print metrics. Sometimes we don't care about the
raw values, just want the computed metrics. This allows more compact
printing, so with -I each sample is only a single line.  This also
allows easier plotting and processing with other tools.

The main target is with using --topdown, but it also works with -T and
standard perf stat. A few metrics are not supported.

To avoiding having to hardcode all the metrics in the code it uses a two
pass approach: first compute dummy metrics and only print the headers in
the print_metric callback. Then use the callback to print the actual
values.

There are some additional changes in the stat printout code to handle
all metrics being on a single line.

One issue is that the column code doesn't know in advance what events
are not supported by the CPU, and it would be hard to find out as this
could change based on dynamic conditions. That causes empty columns in
some cases.

The output can be fairly wide, often you may need more than 80 columns.

Example:

% perf stat -a -I 1000 --metric-only
     1.001452803 frontend cycles idle insn per cycle       stalled cycles per insn branch-misses of all branches
     1.001452803  158.91%               0.66                2.39                    2.92%
     2.002192321  180.63%               0.76                2.08                    2.96%
     3.003088282  150.59%               0.62                2.57                    2.84%
     4.004369835  196.20%               0.98                1.62                    3.79%
     5.005227314  231.98%               0.84                1.90                    4.71%

v2: Lots of updates.
v3: Use slightly narrower columns
v4: Add comment

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-6-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:49:40 -03:00
Andi Kleen
6b45f7b2a3 perf stat: Document CSV format in manpage
With all the recently added fields in the perf stat CSV output we should
finally document them in the man page. Do this here.

v2: Fix fields in documentation (Jiri)
v3: fix order of fields again (Jiri)
v4: Change order again.
v5: Document more fields (Jiri)
v6: Move time stamp first
v7: More fixes (Jiri)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:49:06 -03:00
Namhyung Kim
599a2f38a9 perf hists browser: Check sort keys before hot key actions
The context menu in TUI hists browser checks corresponding sort keys
when creating the menu item.  But hotkey actions lacks these checks so
it can filter using incorrect info.

For example, default sort key of 'perf top' doesn't contain 'comm' or
'pid' sort key so each hist entry's thread info is not reliable.  Thus
it should prohibit using thread filter on 't' key.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457533253-21419-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:48:02 -03:00
Namhyung Kim
6962ccb37b perf hists browser: Allow thread filtering for comm sort key
The commit 2eafd410e6 ("perf hists browser: Only 'Zoom into thread'
only when sort order has 'pid'") disabled thread filtering in hist
browser for the default sort key.  However the he->thread is still valid
even if 'pid' sort key is not given.  Only thing it should not use is
the pid (or tid) of the thread.  So allow to filter by thread when
'comm' sort key is given and show pid only if 'pid' sort key is given.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457536490-24084-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:47:37 -03:00
Namhyung Kim
078b8d4a40 perf tools: Add sort__has_comm variable
The sort__has_comm variable is to check whether the comm sort key is
given.  This is necessary to support thread filtering in the TUI hists
browser later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457533253-21419-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-10 16:47:19 -03:00