Commit Graph

15026 Commits

Author SHA1 Message Date
Anders Roxell
bdefe01a6b selftests/memfd: add run_fuse_test.sh to TEST_FILES
While testing memfd tests, there is a missing script, as reported by
kselftest:

  ./run_tests.sh: line 7: ./run_fuse_test.sh: No such file or directory

Link: http://lkml.kernel.org/r/1517955779-11386-1-git-send-email-daniel.diaz@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21 15:35:43 -08:00
Martin Kelly
7ed1c1901f tools: fix cross-compile var clobbering
Currently a number of Makefiles break when used with toolchains that
pass extra flags in CC and other cross-compile related variables (such
as --sysroot).

Thus we get this error when we use a toolchain that puts --sysroot in
the CC var:

  ~/src/linux/tools$ make iio
  [snip]
  iio_event_monitor.c:18:10: fatal error: unistd.h: No such file or directory
    #include <unistd.h>
             ^~~~~~~~~~

This occurs because we clobber several env vars related to
cross-compiling with lines like this:

  CC = $(CROSS_COMPILE)gcc

Although this will point to a valid cross-compiler, we lose any extra
flags that might exist in the CC variable, which can break toolchains
that rely on them (for example, those that use --sysroot).

This easily shows up using a Yocto SDK:

  $ . [snip]/sdk/environment-setup-cortexa8hf-neon-poky-linux-gnueabi

  $ echo $CC
  arm-poky-linux-gnueabi-gcc -march=armv7-a -mfpu=neon -mfloat-abi=hard
  -mcpu=cortex-a8
  --sysroot=[snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi

  $ echo $CROSS_COMPILE
  arm-poky-linux-gnueabi-

  $ echo ${CROSS_COMPILE}gcc
  krm-poky-linux-gnueabi-gcc

Although arm-poky-linux-gnueabi-gcc is a cross-compiler, we've lost the
--sysroot and other flags that enable us to find the right libraries to
link against, so we can't find unistd.h and other libraries and headers.
Normally with the --sysroot flag we would find unistd.h in the sdk
directory in the sysroot:

  $ find [snip]/sdk/sysroots -path '*/usr/include/unistd.h'
  [snip]/sdk/sysroots/cortexa8hf-neon-poky-linux-gnueabi/usr/include/unistd.h

The perf Makefile adds CC = $(CROSS_COMPILE)gcc if and only if CC is not
already set, and it compiles correctly with the above toolchain.

So, generalize the logic that perf uses in the common Makefile and
remove the manual CC = $(CROSS_COMPILE)gcc lines from each Makefile.

Note that this patch does not fix cross-compile for all the tools (some
have other bugs), but it does fix it for all except usb and acpi, which
still have other unrelated issues.

I tested both with and without the patch on native and cross-build and
there appear to be no regressions.

Link: http://lkml.kernel.org/r/20180107214028.23771-1-martin@martingkelly.com
Signed-off-by: Martin Kelly <martin@martingkelly.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Jonathan Cameron <jic23@kernel.org>
Cc: Pali Rohar <pali.rohar@gmail.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Robert Moore <robert.moore@intel.com>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Valentina Manea <valentina.manea.m@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-21 15:35:42 -08:00
Todd E Brandt
700abc90f0 pm-graph: AnalyzeSuspend v5.0
- add -cgskip option to reduce callgraph output size
- add -cgfilter option to focus on a list of devices
- add -result option for exporting batch test results
- removed all phoronix hooks, use -result to enable batch testing
- change -usbtopo to -devinfo, now prints all devices
- add -gzip option to read/write logs in gz format
- add -bufsize option to manually control ftrace buffer size
- add -sync option to run filesystem sync prior to test
- add -display option to enable/disable the display prior to test
- add -rs option to enable/disable runtime suspend on all devices for test
- add installed config files to search path
- add kernel error/warning links into the timeline
- fix callgraph trace to better handle interrupts
- include command string and kernel params in timeline output header

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-21 23:56:22 +01:00
Todd E Brandt
d83a76a8ec pm-graph: AnalyzeBoot v2.2
- add -cgskip option to reduce callgraph output size
- add -cgfilter option to focus on a list of devices
- add -result option for exporting batch test results
- removed all phoronix hooks, use -result to enable batch testing
- changed argument -f to match sleegraph, -f = -callgraph
- use -fstat for function status instead of -f
- add -verbose option to print out timeline stats and kernel options
- include command string and kernel params in timeline output header

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-21 23:56:22 +01:00
Todd E Brandt
a6fbdbb2b8 pm-graph: config files and installer
- name change: analyze_boot.py to bootgraph.py
- name change: analyze_suspend.py to sleepgraph.py
- added config files for easier sleepgraph usage
- added example.cfg which describes all config options
- added cgskip.txt definition for slimmer callgraphs

Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-02-21 23:56:22 +01:00
David S. Miller
bf006d18b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2018-02-20

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a memory leak in LPM trie's map_free() callback function, where
   the trie structure itself was not freed since initial implementation.
   Also a synchronize_rcu() was needed in order to wait for outstanding
   programs accessing the trie to complete, from Yonghong.

2) Fix sock_map_alloc()'s error path in order to correctly propagate
   the -EINVAL error in case of too large allocation requests. This
   was just recently introduced when fixing close hooks via ULP layer,
   fix from Eric.

3) Do not use GFP_ATOMIC in __cpu_map_entry_alloc(). Reason is that this
   will not work with the recent __ptr_ring_init_queue_alloc() conversion
   to kvmalloc_array(), where in case of fallback to vmalloc() that GFP
   flag is invalid, from Jason.

4) Fix two recent syzkaller warnings: i) fix bpf_prog_array_copy_to_user()
   when a prog query with a big number of ids was performed where we'd
   otherwise trigger a warning from allocator side, ii) fix a missing
   mlock precharge on arraymaps, from Daniel.

5) Two fixes for bpftool in order to avoid breaking JSON output when used
   in batch mode, from Quentin.

6) Move a pr_debug() in libbpf in order to avoid having an otherwise
   uninitialized variable in bpf_program__reloc_text(), from Jeremy.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-21 15:37:37 -05:00
Andi Kleen
42811d509d perf stat: Use xyarray dimensions to iterate fds
Now that the xyarray stores the dimensions we can use those
to iterate over the FDs for a evsel.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171006020029.13339-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-21 11:36:57 -03:00
Sangwon Hong
de71128688 perf kallsyms: Fix the usage on the man page
First, all man pages highlight only perf and subcommands except 'perf
kallsyms', which includes the full usage. Fix it for commands to
monopolize underlines.

Second, options can be ommited when executing 'perf kallsyms', so add
square brackets between <option>.

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1518377864-20353-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-21 09:23:36 -03:00
Alan Stern
bf28ae5627 tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference
Since commit 76ebbe78f7 ("locking/barriers: Add implicit
smp_read_barrier_depends() to READ_ONCE()") was merged for the 4.15
kernel, it has not been necessary to use smp_read_barrier_depends().
Similarly, commit 59ecbbe7b3 ("locking/barriers: Kill
lockless_dereference()") removed lockless_dereference() from the
kernel.

Since these primitives are no longer part of the kernel, they do not
belong in the Linux Kernel Memory Consistency Model.  This patch
removes them, along with the internal rb-dep relation, and updates the
revelant documentation.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-12-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:16 +01:00
Paul E. McKenney
cac79a39f2 tools/memory-model: Convert underscores to hyphens
Typical cat-language code uses hyphens for word separators in
identifiers, but several LKMM identifiers use underscores instead.
This commit therefore converts underscores to hyphens in the .bell-
and .cat-file identifiers corresponding to smp_mb__before_atomic(),
smp_mb__after_atomic(), and smp_mb__after_spinlock().

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: stern@rowland.harvard.edu
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-11-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:15 +01:00
Alan Stern
556bb7d252 tools/memory-model: Add a S lock-based external-view litmus test
This commit adds a litmus test in which P0() and P1() form a lock-based S
litmus test, with the addition of P2(), which observes P0()'s and P1()'s
accesses with a full memory barrier but without the lock.  This litmus
test asks whether writes carried out by two different processes under the
same lock will be seen in order by a third process not holding that lock.
The answer to this question is "yes" for all architectures supporting
the Linux kernel, but is "no" according to the current version of LKMM.

A patch to LKMM is under development.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-10-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:15 +01:00
Paul E. McKenney
8f7f2fbd00 tools/memory-model: Add required herd7 version to README file
LKMM and the herd7 tool are co-evolving, and out-of-date herd7 tools
produce inaccurate results, often with no obvious error messages.  This
commit therefore adds the required herd7 version to the LKMM README file.

Longer term, it would be good if .cat files could specify the required
version in a manner allowing herd7 to produce clear diagnostics.

Suggested-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: stern@rowland.harvard.edu
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-9-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:15 +01:00
Paul E. McKenney
6215514704 README: Fix a couple of punctuation errors
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: stern@rowland.harvard.edu
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-5-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:14 +01:00
Paul E. McKenney
8f32543b61 EXP litmus_tests: Add comments explaining tests' purposes
This commit adds comments to the litmus tests summarizing what these
tests are intended to demonstrate.

[ paulmck: Apply Andrea's and Alan's feedback. ]
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Cc: parri.andrea@gmail.com
Cc: stern@rowland.harvard.edu
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1519169112-20593-4-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:13 +01:00
Andrea Parri
e7d74c9f90 MAINTAINERS: Add the Memory Consistency Model subsystem
Move the contents of tools/memory-model/MAINTAINERS into the main
MAINTAINERS file, removing tools/memory-model/MAINTAINERS. This
allows get_maintainer.pl to correctly identify the maintainers of
tools/memory-model/.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Link: http://lkml.kernel.org/r/1519169112-20593-2-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:12 +01:00
Andrea Parri
48d44d4e8a tools/memory-model: Clarify the origin/scope of the tool name
Ingo pointed out that:

  "The "memory model" name is overly generic, ambiguous and somewhat
   misleading, as we usually mean the virtual memory layout/model
   when we say "memory model". GCC too uses it in that sense [...]"

Make it clear that tools/memory-model/ uses the term "memory model" as
shorthand for "memory consistency model" by calling out this convention
in tools/memory-model/README.

Stick to the original "memory model" term in sources' headers and for
the subsystem name.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akiyks@gmail.com
Cc: boqun.feng@gmail.com
Cc: dhowells@redhat.com
Cc: j.alglave@ucl.ac.uk
Cc: linux-arch@vger.kernel.org
Cc: luc.maranget@inria.fr
Cc: nborisov@suse.com
Cc: npiggin@gmail.com
Link: http://lkml.kernel.org/r/1519169112-20593-1-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:58:12 +01:00
Ingo Molnar
862e6e2a60 Merge tag 'v4.16-rc2' into locking/core, to refresh the branch
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:57:55 +01:00
Peter Zijlstra
ca41b97ed9 objtool: Add module specific retpoline rules
David allowed retpolines in .init.text, except for modules, which will
trip up objtool retpoline validation, fix that.

Requested-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:05:05 +01:00
Peter Zijlstra
b5bc2231b8 objtool: Add retpoline validation
David requested a objtool validation pass for CONFIG_RETPOLINE=y enabled
builds, where it validates no unannotated indirect  jumps or calls are
left.

Add an additional .discard.retpoline_safe section to allow annotating
the few indirect sites that are required and safe.

Requested-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:05:04 +01:00
Peter Zijlstra
43a4525f80 objtool: Use existing global variables for options
Use the existing global variables instead of passing them around and
creating duplicate global variables.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-21 09:05:04 +01:00
Paul E. McKenney
85ba6bfe8b torture: Provide more sensible nreader/nwriter defaults for rcuperf
The default values for nreader and nwriter are apparently not all that
user-friendly, resulting in people doing scalability tests that ran all
runs at large scale.  This commit therefore makes both the nreaders and
nwriters module default to the number of CPUs, and adds a comment to
rcuperf.c stating that the number of CPUs should be specified using the
nr_cpus kernel boot parameter.  This commit also eliminates the redundant
rcuperf scripting specification of default values for these parameters.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:22:01 -08:00
Paul E. McKenney
0da8c08d71 torture: Grace periods do not piggyback off of themselves
The rcuperf trace-event processing counted every "done" trace event
as a piggyback, which is incorrect because the task that started the
grace period didn't piggyback at all.  This commit fixes this problem
by recording the task that started a given grace period and ignoring
that task's "done" record for that grace period.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:22:01 -08:00
Paul E. McKenney
cc839ce55d torture: Adjust rcuperf trace processing to allow for workqueues
The rcuperf event-trace processing assumes that expedited grace periods
start and end on the same task, an assumption that was violated by moving
expedited grace-period processing to workqueues.  This commit removes
this now-fallacious assumption from rcuperf's event-trace processing.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:22:00 -08:00
Paul E. McKenney
adcfe76c61 torture: Default jitter off when running rcuperf
The purpose of jitter is to expose concurrency bugs due to invalid
assumptions about forward progress.  There is usually little point
in jitter when measuring performance.  This commit therefore defaults
jitter off when running rcuperf.  You can override this by specifying
the kvm.sh "--jitter" argument -after- the "--torture rcuperf"
argument.  No idea why you would want this, but if you do, that is
how you do it.

One example of a conccurrency bug that this jitter might expose is one
in which the developer assumed that a given short region of code would be
guaranteed to execute within some short time limit.  Such assumptions are
invalid in virtualized environments because the hupervisor can preempt
the guest OS at any point, even when the guest OS thinks that it has
disabled interrupts.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:22:00 -08:00
Paul E. McKenney
642146b1ad torture: Specify qemu memory size with --memory argument
The 512 megabyte memory size has served quite well, but more memory
is required when using large trace buffers on large systems.  This
commit therefore adds a --memory argument to the kvm.sh script, which
allows the memory size to be specified on the command line, for example,
"--memory 768", --memory 800M", or "--memory 2G".

Reported-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:21:59 -08:00
Lihao Liang
67ad71ddbc rcutorture: Add basic ARM64 support to run scripts
This commit adds support of the qemu command qemu-system-aarch64
to rcutorture.

Signed-off-by: Lihao Liang <lianglihao@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:21:59 -08:00
Paul E. McKenney
db92ca3ab8 rcutorture: Update kvm.sh header comment
The kvm.sh header comment is a bit of a relic, so this commit brings
it up to date.

Reported-by: Lihao Liang <lianglihao@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:21:58 -08:00
Lihao Liang
f4f2cf8bd8 doc: Fix typo in rcutorture documentation
Signed-off-by: Lihao Liang <lianglihao@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-02-20 16:10:19 -08:00
Jeremy Cline
b1a2ce8257 tools/libbpf: Avoid possibly using uninitialized variable
Fixes a GCC maybe-uninitialized warning introduced by 48cca7e44f.
"text" is only initialized inside the if statement so only print debug
info there.

Fixes: 48cca7e44f ("libbpf: add support for bpf_call")
Signed-off-by: Jeremy Cline <jeremy@jcline.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-02-20 21:08:20 +01:00
David S. Miller
f5c0c6f429 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-02-19 18:46:11 -05:00
Jaroslav Škarvada
66dfdff03d perf tools: Add Python 3 support
Added Python 3 support while keeping Python 2.7 compatibility.

Committer notes:

This doesn't make it to auto detect python 3, one has to explicitely ask
it to build with python 3 devel files, here are the instructions
provided by Jaroslav:

 ---
  $ cp -a tools/perf tools/python3-perf
  $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 all
  $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 all
  $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 DESTDIR=%{buildroot} install-python_ext
  $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 DESTDIR=%{buildroot} install-python_ext
 ---

We need to make this automatic, just like the existing tests for checking if
the python2 devel files are in place, allowing the build with python3 if
available, fallbacking to python2 and then just disabling it if none are
available.

So, using the PYTHON variable to build it using O= we get:

Before this patch:

  $ rpm -q python3 python3-devel
  python3-3.6.4-7.fc27.x86_64
  python3-devel-3.6.4-7.fc27.x86_64
  $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf PYTHON=/usr/bin/python3 -C tools/perf install-bin
  make: Entering directory '/home/acme/git/linux/tools/perf'
  <SNIP>
  Makefile.config:670: Python 3 is not yet supported; please set
  Makefile.config:671: PYTHON and/or PYTHON_CONFIG appropriately.
  Makefile.config:672: If you also have Python 2 installed, then
  Makefile.config:673: try something like:
  Makefile.config:674:
  Makefile.config:675:   make PYTHON=python2
  Makefile.config:676:
  Makefile.config:677: Otherwise, disable Python support entirely:
  Makefile.config:678:
  Makefile.config:679:   make NO_LIBPYTHON=1
  Makefile.config:680:
  Makefile.config:681: *** .  Stop.
  make[1]: *** [Makefile.perf:212: sub-make] Error 2
  make: *** [Makefile:110: install-bin] Error 2
  make: Leaving directory '/home/acme/git/linux/tools/perf'
  $

After:

  $ make O=/tmp/build/perf PYTHON=python3 -C tools/perf install-bin
  $ ldd ~/bin/perf | grep python
	libpython3.6m.so.1.0 => /lib64/libpython3.6m.so.1.0 (0x00007f58a31e8000)
  $ rpm -qf /lib64/libpython3.6m.so.1.0
  python3-libs-3.6.4-7.fc27.x86_64
  $

Now verify that when using the binding the right ELF file is loaded,
using perf trace:

  $ perf trace -e open* perf test python
     0.051 ( 0.016 ms): perf/3927 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
<SNIP>
  18: 'import perf' in python                               :
     8.849 ( 0.013 ms): sh/3929 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC           ) = 3
<SNIP>
    25.572 ( 0.008 ms): python3/3931 openat(dfd: CWD, filename: /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so, flags: CLOEXEC) = 3
<SNIP>
 Ok
<SNIP>
  $

And using tools/perf/python/twatch.py, to show PERF_RECORD_ metaevents:

  $ python3 tools/perf/python/twatch.py
  cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5207, ppid: 16060, tid: 5207, ptid: 16060, time: 10798513015459}
  cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5208, ppid: 16060, tid: 5208, ptid: 16060, time: 10798513562503}
  cpu: 0, pid: 5208, tid: 5208 { type: comm, pid: 5208, tid: 5208, comm: grep }
  cpu: 2, pid: 5207, tid: 5207 { type: comm, pid: 5207, tid: 5207, comm: ps }
  cpu: 2, pid: 5207, tid: 5207 { type: exit, pid: 5207, ppid: 5207, tid: 5207, ptid: 5207, time: 10798551337484}
  cpu: 3, pid: 5208, tid: 5208 { type: exit, pid: 5208, ppid: 5208, tid: 5208, ptid: 5208, time: 10798551292153}
  cpu: 3, pid: 601, tid: 601 { type: fork, pid: 5209, ppid: 601, tid: 5209, ptid: 601, time: 10801779977324}
  ^CTraceback (most recent call last):
    File "tools/perf/python/twatch.py", line 68, in <module>
      main()
    File "tools/perf/python/twatch.py", line 40, in main
      evlist.poll(timeout = -1)
  KeyboardInterrupt
  $

  # ps ax|grep twatch
 5197 pts/8    S+     0:00 python3 tools/perf/python/twatch.py
  # ls -la /proc/5197/smaps
  -r--r--r--. 1 acme acme 0 Feb 19 13:14 /proc/5197/smaps
  # grep python /proc/5197/smaps
  558111307000-558111309000 r-xp 00000000 fd:00 3151710  /usr/bin/python3.6
  558111508000-558111509000 r--p 00001000 fd:00 3151710  /usr/bin/python3.6
  558111509000-55811150a000 rw-p 00002000 fd:00 3151710  /usr/bin/python3.6
  7ffad6fc1000-7ffad7008000 r-xp 00000000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
  7ffad7008000-7ffad7207000 ---p 00047000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
  7ffad7207000-7ffad7208000 r--p 00046000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
  7ffad7208000-7ffad7215000 rw-p 00047000 00:2d 220196   /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so
  7ffadea77000-7ffaded3d000 r-xp 00000000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
  7ffaded3d000-7ffadef3c000 ---p 002c6000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
  7ffadef3c000-7ffadef42000 r--p 002c5000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
  7ffadef42000-7ffadefa5000 rw-p 002cb000 fd:00 3151795  /usr/lib64/libpython3.6m.so.1.0
  #

And with this patch, but building normally, without specifying the
PYTHON=python3 part, which will make it use python2 if its devel files are
available, like in this test:

  $ make O=/tmp/build/perf -C tools/perf install-bin
  $ ldd ~/bin/perf | grep python
	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f6a44410000)
  $ ldd /tmp/build/perf/python_ext_build/lib/perf.so  | grep python
	libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007fed28a2c000)
  $

  [acme@jouet perf]$ tools/perf/python/twatch.py
  cpu: 0, pid: 2817, tid: 2817 { type: fork, pid: 2817, ppid: 2817, tid: 8910, ptid: 2817, time: 11126454335306}
  cpu: 0, pid: 2817, tid: 2817 { type: comm, pid: 2817, tid: 8910, comm: worker }
  $ ps ax | grep twatch.py
   8909 pts/8    S+     0:00 /usr/bin/python tools/perf/python/twatch.py
  $ grep python /proc/8909/smaps
  5579de658000-5579de659000 r-xp 00000000 fd:00 3156044  /usr/bin/python2.7
  5579de858000-5579de859000 r--p 00000000 fd:00 3156044  /usr/bin/python2.7
  5579de859000-5579de85a000 rw-p 00001000 fd:00 3156044  /usr/bin/python2.7
  7f0de01f7000-7f0de023e000 r-xp 00000000 00:2d 230695   /tmp/build/perf/python/perf.so
  7f0de023e000-7f0de043d000 ---p 00047000 00:2d 230695   /tmp/build/perf/python/perf.so
  7f0de043d000-7f0de043e000 r--p 00046000 00:2d 230695   /tmp/build/perf/python/perf.so
  7f0de043e000-7f0de044b000 rw-p 00047000 00:2d 230695   /tmp/build/perf/python/perf.so
  7f0de6f0f000-7f0de6f13000 r-xp 00000000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
  7f0de6f13000-7f0de7113000 ---p 00004000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
  7f0de7113000-7f0de7114000 r--p 00004000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
  7f0de7114000-7f0de7115000 rw-p 00005000 fd:00 134975   /usr/lib64/python2.7/lib-dynload/_localemodule.so
  7f0de7e73000-7f0de8052000 r-xp 00000000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
  7f0de8052000-7f0de8251000 ---p 001df000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
  7f0de8251000-7f0de8255000 r--p 001de000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
  7f0de8255000-7f0de8291000 rw-p 001e2000 fd:00 3173292  /usr/lib64/libpython2.7.so.1.0
  $

Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 20180119205641.24242-1-jskarvad@redhat.com
Link: https://lkml.kernel.org/n/tip-8d7dt9kqp83vsz25hagug8fu@git.kernel.org
[ Removed explicit check for python version, allowing it to really build with python3 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19 12:28:23 -03:00
Arnaldo Carvalho de Melo
d2ed5d2bdc perf python: Make twatch.py work with both python2 and python3
Will be used to test patches allowing to build perf with python3, so
that we make sure that we can build with both versions.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-c2ynv0ozr3eifzsyit6qgh3h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19 12:28:08 -03:00
Changbin Du
63cd02d84b perf ftrace: Append an EOL when write tracing files
Before this change, the '--graph-funcs', '--nograph-funcs' and
'--trace-funcs' options didn't work as expected when the <func> doesn't
exist. Because the kernel side hid possible errors.

  $ sudo ./perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
   0)   0.140 us    |  rcu_all_qs();
   3)   0.304 us    |  mutex_unlock();
   0)   0.153 us    |  find_vma();
   3)   0.088 us    |  __fsnotify_parent();
   0)   6.145 us    |  handle_mm_fault();
   3)   0.089 us    |  fsnotify();
   3)   0.161 us    |  __sb_end_write();
   3)   0.710 us    |  SyS_close();
   3)   7.848 us    |  exit_to_usermode_loop();

On the example above, I specified the function filter 'abcdefg' but all
functions are enabled. The expected result is for all functions to be
filtered, since there is no such function ('abcdefg')

The original fix is to make the kernel support '\0' as end of string:
https://lkml.org/lkml/2018/1/16/116

But above fix cannot be compatible with old kernels. Then Namhyung Kim
suggest adding a space after function name.

This patch will append an '\n' when write tracing file. After this fix,
the perf will report correct error state. Also let it print an error if
reset_tracing_files() fails.

Committer testing:

Now it prints:

  # perf ftrace -a --graph-depth 1 --graph-funcs abcdefg
  failed to set tracing filters
  #

And for an existing function:

  # perf ftrace -a --graph-depth 1 --graph-funcs SyS_open
   3)               |  SyS_open() {
   3) ! 494.899 us  |  }
   0) + 23.910 us   |  SyS_open();
   1) + 17.115 us   |  SyS_open();
   1) + 13.900 us   |  SyS_open();
   ------------------------------------------
   3)  qemu-sy-2817  =>  pickup-1290
   ------------------------------------------

   3) + 20.021 us   |  SyS_open();
  #

Signed-off-by: Changbin Du <changbin.du@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1519007609-14551-1-git-send-email-changbin.du@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19 09:49:12 -03:00
Namhyung Kim
1d12cec6ce perf machine: Fix paranoid check in machine__set_kernel_mmap()
The machine__set_kernel_mmap() is to setup addresses of the kernel map
using external info.  But it has a check when the address is given from
an incorrect input which should have the start and end address of 0
(i.e. machine__process_kernel_mmap_event).

But we also use the end address of 0 for a valid input so change it to
check both start and end addresses.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20180219101936.GD1583@sejong
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19 09:17:46 -03:00
Thomas Richter
47812e0091 perf s390: Fix reading cpuid model information
Commit eca0fa28cd (perf record: Provide detailed information on s390
CPU") fixed a  build error on Ubuntu. However the fix uses the wrong
size to print the model information.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: eca0fa28cd ("perf record: Provide detailed information on s390 CPU")
Link: http://lkml.kernel.org/r/20180219102444.96900-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19 09:16:01 -03:00
Ingo Molnar
11737ca9e3 Merge tag 'perf-core-for-mingo-4.17-20180216' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

- Fix wrong jump arrow in systems with branch records with cycles,
  i.e. Intel's >= Skylake (Jin Yao)

- Fix 'perf record --per-thread' problem introduced when
  implementing 'perf stat --per-thread (Jin Yao)

- Use arch__compare_symbol_names() to fix 'perf test vmlinux',
  that was using strcmp(symbol names) while the dso routines
  doing symbol lookups used the arch overridable one, making
  this test fail in architectures that overrided that function
  with something other than strcmp() (Jiri Olsa)

- Add 'perf script --show-round-event' to display
  PERF_RECORD_FINISHED_ROUND entries (Jiri Olsa)

- Fix dwarf unwind for stripped binaries in 'perf test' (Jiri Olsa)

- Use ordered_events for 'perf report --tasks', otherwise we may get
  artifacts when PERF_RECORD_FORK gets processed before PERF_RECORD_COMM
  (when they got recorded in different CPUs) (Jiri Olsa)

- Add support to display group output for non group events, i.e.
  now when one uses 'perf report --group' on a perf.data file
  recorded without explicitly grouping events with {} (e.g.
  "perf record -e '{cycles,instructions}'" get the same output
  that would produce, i.e. see all those non-grouped events in
  multiple columns, at the same time (Jiri Olsa)

- Skip non-address kallsyms entries, e.g. '(null)' for !root (Jiri Olsa)

- Kernel maps fixes wrt perf.data(report) versus live system (top)
  (Jiri Olsa)

- Fix memory corruption when using 'perf record -j call -g -a <application>'
  followed by 'perf report --branch-history' (Jiri Olsa)

- ARM CoreSight fixes (Mathieu Poirier)

- Add inject capability for CoreSight Traces (Robert Waker)

- Update documentation for use of 'perf' + ARM CoreSight (Robert Walker)

- Man pages fixes (Sangwon Hong, Jaecheol Shin)

- Fix some 'perf test' cases on s/390 and x86_64 (some backtraces
  changed with a glibc update) (Thomas Richter)

- Add detailed CPUID info in the 'perf.data' headers for s/390 to
  then use it in 'perf annotate' (Thomas Richter)

- Add '--interval-count N' to 'perf stat', to use with -I, i.e.
  'perf stat -I 1000 --interval-count 2' will show stats every
   1000ms, two times (yuzhoujian)

- Add 'perf stat --timeout Nms', that will run for that many
  milliseconds and then stop, printing the counters (yuzhoujian)

- Fix description for 'perf report --mem-modex (Andi Kleen)

- Use a wildcard to remove the vfs_getname probe in the
  'perf test' shell based test cases (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-17 11:39:47 +01:00
Ingo Molnar
7057bb975d Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-17 11:39:28 +01:00
Sowmini Varadhan
dfb8434b0a selftests/net: add zerocopy support for PF_RDS test case
Send a cookie with sendmsg() on PF_RDS sockets, and process the
returned batched cookies in do_recv_completion()

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:04:17 -05:00
Sowmini Varadhan
b16ac92040 selftests/net: add support for PF_RDS sockets
Add support for basic PF_RDS client-server testing in msg_zerocopy

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-16 16:04:17 -05:00
Arnaldo Carvalho de Melo
21316ac680 perf tests shell lib: Use a wildcard to remove the vfs_getname probe
In some situations the vfs_getname is being added both as requested and
with a _1 suffix (inlines?):

  probe:vfs_getname_1  (on getname_flags:63@acme/git/linux/fs/namei.c with pathname)

This ends up making the cleanup to miss that one, as it removes just
'probe:vfs_getname', which makes the second test to use this probe point
to fail, since it finds that leftover from the first test, use a
wildcard to remove both.

Before:

  # perf test 60 61 62 63
  60: Use vfs_getname probe to get syscall args filenames   : FAILED!
  61: probe libc's inet_pton & backtrace it with ping       : Ok
  62: Check open filename arg using perf trace + vfs_getname: FAILED!
  63: Add vfs_getname probe to get syscall args filenames   : Ok

After:

  # perf test 60 61 62 63
  60: Use vfs_getname probe to get syscall args filenames   : Ok
  61: probe libc's inet_pton & backtrace it with ping       : Ok
  62: Check open filename arg using perf trace + vfs_getname: Ok
  63: Add vfs_getname probe to get syscall args filenames   : Ok
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-2k5kutwr4ds36adiakyb4yvy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:31:12 -03:00
Thomas Richter
0f19a038af perf test: Fix test case inet_pton to accept inlines.
Using Fedora 27 and latest Linux kernel the test case
trace+probe_libc_inet_pton.sh fails again on s390.  This time is the
inlining of functions which does not match.  After an update of the
glibc (from 2.26-16 to 2.26-24) the output is different

The expected output is:

             __inet_pton (/usr/lib64/libc-2.26.so)
             gaih_inet (inlined)
             ....

The actual output is:

  1 packets transmitted, 1 received, 0% packet loss, time 0ms
  rtt min/avg/max/mdev = 0.061/0.061/0.061/0.000 ms
       0.000 probe_libc:inet_pton:(3ffb2140448))
             __inet_pton (inlined)
             gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
             ...

Fix this by being less strict on 'inlined' verses library name and
accept both

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180214070303.55757-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:16:58 -03:00
Thomas Richter
b3be39c51c perf test: Fix test case 23 for s390 z/VM or KVM guests
On s390 perf can be executed on a LPAR with support for hardware events
(i. e. cycles) or on a z/VM or KVM guest where no hardware events are
supported. In this environment use software event named cpu-clock for
this test case.

Use the cpuid infrastructure functions to determine the cpuid on s390
which contains an indication of the cpu counter facility availability.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-4-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:16:57 -03:00
Thomas Richter
4cb7d3ecfc perf cpuid: Introduce a platform specific cpuid compare function
The function get_cpuid_str() is called by perf_pmu__getcpuid() and on
s390 returns a complete description of the CPU and its capabilities,
which is a comma separated list.

To map the CPU type with the value defined in the
pmu-events/arch/s390/mapfile.csv, introduce an architecture specific
cpuid compare function named strcmp_cpuid_str()

The currently used regex algorithm is defined as the weak default and
will be used if no platform specific one is defined. This matches the
current behavior.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-3-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:16:57 -03:00
Thomas Richter
c59124fa59 perf annotate: Scan cpuid for s390 and save machine type
Scan the cpuid string and extract the type number for later use.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-2-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:16:57 -03:00
Thomas Richter
eca0fa28cd perf record: Provide detailed information on s390 CPU
When perf record ... is setup to record data, the s390 cpu information
was a fixed string "IBM/S390".

Replace this string with one containing more information about the
machine. The information included in the cpuid is a comma separated
list:

   manufacturer,type,model-capacity,model[,version,authorization]
with

- manufacturer: up to 16 byte name of the manufacturer (IBM).
- type: a four digit number refering to the machine
  generation.
- model-capacitiy: up to 16 characters describing number
  of cpus etc.
- model: up to 16 characters describing model.
- version: the CPU-MF counter facility version number,
  available on LPARs only, omitted on z/VM guests.
- authorization: the CPU-MF counter facility authorization level,
  available on LPARs only, omitted on z/VM guests.

Before:

  [root@s8360047 perf]# ./perf record -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (4 samples) ]
  [root@s8360047 perf]# ./perf report --header | fgrep cpuid
   # cpuid : IBM/S390
  [root@s8360047 perf]#

After:

  [root@s35lp76 perf]# ./perf report --header|fgrep cpuid
   # cpuid : IBM,3906,704,M03,3.5,002f
  [root@s35lp76 perf]#

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180213151419.80737-1-tmricht@linux.vnet.ibm.com
[ Use scnprintf instead of strncat to fix build errors on gcc GNU C99 5.4.0 20160609 -march=zEC12 -m64 -mzarch -ggdb3 -O6 -std=gnu99 -fPIC -fno-omit-frame-pointer -funwind-tables -fstack-protector-all ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 15:15:23 -03:00
Ravi Bangoria
4281da235e perf trace powerpc: Use generated syscall table
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.

It also enables users to specify wildcards, for example, perf trace -e
'open*', just like was already possible on x86 and s390.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20180129083417.31240-4-ravi.bangoria@linux.vnet.ibm.com
[ Do it for ppc32 as well ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:50 -03:00
Ravi Bangoria
8e2ff72aa3 perf powerpc: Generate system call table from asm/unistd.h
This should speed up accessing new system calls introduced with the
kernel rather than waiting for libaudit updates to include them.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20180129083417.31240-3-ravi.bangoria@linux.vnet.ibm.com
[ Made it generate syscall_32.c as well to fix the build on 32-bit ppc ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:48 -03:00
Ravi Bangoria
1350fb7d1b tools include powerpc: Grab a copy of arch/powerpc/include/uapi/asm/unistd.h
Will be used for generating the syscall id/string translation table.

Committer notes:

Update it already to catch with these csets applied since Ravi first
submitted this patch:

  3350eb2ea1 powerpc: sys_pkey_mprotect() system call
  9499ec1b5e powerpc: sys_pkey_alloc() and sys_pkey_free() system calls

So now 'perf trace' on ppc now knows about the pkey_ syscals.

Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20180129083417.31240-2-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:47 -03:00
Jiri Olsa
e3ebaa4651 perf report: Fix memory corruption in --branch-history mode --branch-history
Jin Yao reported memory corrupton in perf report with
branch info used for stack trace:

  > Following command lines will cause perf crash.

  > perf record -j call -g -a <application>
  > perf report --branch-history
  >
  > *** Error in `perf': double free or corruption (!prev): 0x00000000104aa040 ***
  > ======= Backtrace: =========
  > /lib/x86_64-linux-gnu/libc.so.6(+0x77725)[0x7f6b37254725]
  > /lib/x86_64-linux-gnu/libc.so.6(+0x7ff4a)[0x7f6b3725cf4a]
  > /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f6b37260abc]
  > perf[0x51b914]
  > perf(hist_entry_iter__add+0x1e5)[0x51f305]
  > perf[0x43cf01]
  > perf[0x4fa3bf]
  > perf[0x4fa923]
  > perf[0x4fd396]
  > perf[0x4f9614]
  > perf(perf_session__process_events+0x89e)[0x4fc38e]
  > perf(cmd_report+0x15d2)[0x43f202]
  > perf[0x4a059f]
  > perf(main+0x631)[0x427b71]
  > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f6b371fd830]
  > perf(_start+0x29)[0x427d89]

For the cumulative output, we allocate the he_cache array based on the
--max-stack option value and populate it with data from 'callchain_cursor'.

The --max-stack option value does not ensure now the limit for number of
callchain_cursor nodes, so the cumulative iter code will allocate smaller array
than it's actually needed and cause above corruption.

I think the --max-stack limit does not apply here anyway, because we add
callchain data as normal hist entries, while the --max-stack control the limit
of single entry callchain depth.

Using the callchain_cursor.nr as he_cache array count to fix this. Also
removing struct hist_entry_iter::max_stack, because there's no longer any use
for it.

We need more fixes to ensure that the branch stack code follows properly the
logic of --max-stack, which is not the case at the moment.

Original-patch-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180216123619.GA9945@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:47 -03:00
Jin Yao
b40982e846 perf report: Fix wrong jump arrow
When we use perf report interactive annotate view, we can see
the position of jump arrow is not correct. For example,

1. perf record -b ...
2. perf report
3. In interactive mode, select Annotate 'function'

Percent│ IPC Cycle
       │                                if (flag)
  1.37 │0.4┌──   1      ↓ je     82
       │   │                                    x += x / y + y / x;
  0.00 │0.4│  1310        movsd  (%rsp),%xmm0
  0.00 │0.4│   565        movsd  0x8(%rsp),%xmm4
       │0.4│              movsd  0x8(%rsp),%xmm1
       │0.4│              movsd  (%rsp),%xmm3
       │0.4│              divsd  %xmm4,%xmm0
  0.00 │0.4│   579        divsd  %xmm3,%xmm1
       │0.4│              movsd  (%rsp),%xmm2
       │0.4│              addsd  %xmm1,%xmm0
       │0.4│              addsd  %xmm2,%xmm0
  0.00 │0.4│              movsd  %xmm0,(%rsp)
       │   │                    volatile double x = 1212121212, y = 121212;
       │   │
       │   │                    s_randseed = time(0);
       │   │                    srand(s_randseed);
       │   │
       │   │                    for (i = 0; i < 2000000000; i++) {
  1.37 │0.4└─→      82:   sub    $0x1,%ebx
 28.21 │0.48    17      ↑ jne    38

The jump arrow in above example is not correct. It should add the
width of IPC and Cycle.

With this patch, the result is:

Percent│ IPC Cycle
       │                                if (flag)
  1.37 │0.48     1     ┌──je     82
       │               │                        x += x / y + y / x;
  0.00 │0.48  1310     │  movsd  (%rsp),%xmm0
  0.00 │0.48   565     │  movsd  0x8(%rsp),%xmm4
       │0.48           │  movsd  0x8(%rsp),%xmm1
       │0.48           │  movsd  (%rsp),%xmm3
       │0.48           │  divsd  %xmm4,%xmm0
  0.00 │0.48   579     │  divsd  %xmm3,%xmm1
       │0.48           │  movsd  (%rsp),%xmm2
       │0.48           │  addsd  %xmm1,%xmm0
       │0.48           │  addsd  %xmm2,%xmm0
  0.00 │0.48           │  movsd  %xmm0,(%rsp)
       │               │        volatile double x = 1212121212, y = 121212;
       │               │
       │               │        s_randseed = time(0);
       │               │        srand(s_randseed);
       │               │
       │               │        for (i = 0; i < 2000000000; i++) {
  1.37 │0.48        82:└─→sub    $0x1,%ebx
 28.21 │0.48    17      ↑ jne    38

Committer notes:

Please note that only from LBRv5 (according to Jiri) onwards, i.e. >=
Skylake is that we'll have the cycles counts in each branch record
entry, so to see the Cycles and IPC columns, and be able to test this
patch, one need a capable hardware.

While applying this I first tested it on a Broadwell class machine and
couldn't get those columns, will add code to the annotate browser to
warn the user about that, i.e. you have branch records, but no cycles,
use a more recent hardware to get the cycles and IPC columns.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1517223473-14750-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16 14:55:47 -03:00