Commit Graph

15026 Commits

Author SHA1 Message Date
Adrian Hunter
22916fdb9c perf kcore_copy: Amend the offset of sections that remap kernel text
x86 PTI entry trampolines all map to the same physical page. If that is
reflected in the program headers of /proc/kcore, then do the same for the
copy of kcore.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:44 -03:00
Adrian Hunter
a1a3a0624e perf kcore_copy: Copy x86 PTI entry trampoline sections
Identify and copy any sections for x86 PTI entry trampolines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:43 -03:00
Adrian Hunter
b4503cdb67 perf kcore_copy: Get rid of kernel_map
In preparation to add more program headers, get rid of kernel_map and
modules_map by moving ->kernel_map and ->modules_map to newly allocated
entries in the ->phdrs list.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:43 -03:00
Adrian Hunter
d2c959803c perf kcore_copy: Iterate phdrs
In preparation to add more program headers, iterate phdrs instead of
assuming there is only one for the kernel text and one for the modules.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
15acef6c37 perf kcore_copy: Layout sections
In preparation to add more program headers, layout the relative offset
of each section.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:42 -03:00
Adrian Hunter
c9dd1d8949 perf kcore_copy: Calculate offset from phnum
In preparation to add more program headers, calculate offset from the
number of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-13-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
6e97957d3d perf kcore_copy: Keep a count of phdrs
In preparation to add more program headers, keep a count of phdrs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:41 -03:00
Adrian Hunter
f683820948 perf kcore_copy: Keep phdr data in a list
Currently, kcore_copy makes 2 program headers, one for the kernel text
(namely kernel_map) and one for the modules (namely modules_map). Now
more program headers are needed, but treating each program header as a
special case results in much more code.

Instead, in preparation to add more program headers, change to keep
program header data (phdr_data) in a list.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-11-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Jin Yao
787e4da9f9 perf annotate: Show group event string for stdio
When we enable the group, for tui/stdio2, the output first line includes
the group event string. While for stdio, it will show only one event.

For example,

perf record -e cycles,branches ./div
perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles (44407 samples)
    ......

The first line doesn't include the event 'branches'.

With this patch, it will show the correct group even string.

perf annotate --group --stdio

    Percent |      Source code & Disassembly of div for cycles, branches (44407 samples)
    ......

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:40 -03:00
Adrian Hunter
a8ce99b0ee perf machine: Synthesize and process mmap events for x86 PTI entry trampolines
Like the kernel text, the location of x86 PTI entry trampolines must be
recorded in the perf.data file. Like the kernel, synthesize a mmap event
for that, and add processing for it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:26:39 -03:00
Adrian Hunter
1c5aae7710 perf machine: Create maps for x86 PTI entry trampolines
Create maps for x86 PTI entry trampolines, based on symbols found in
kallsyms. It is also necessary to keep track of whether the trampolines
have been mapped particularly when the kernel dso is kcore.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com
[ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23 10:24:08 -03:00
Sirio Balmelli
167381f3ea selftests/bpf: Makefile fix "missing" headers on build with -idirafter
Selftests fail to build on several distros/architectures because of
	missing headers files.

On a Ubuntu/x86_64 some missing headers are:
	asm/byteorder.h, asm/socket.h, asm/sockios.h

On a Debian/arm32 build already fails at sys/cdefs.h

In both cases, these already exist in /usr/include/<arch-specific-dir>,
but Clang does not include these when using '-target bpf' flag,
since it is no longer compiling against the host architecture.

The solution is to:

- run Clang without '-target bpf' and extract the include chain for the
current system

- add these to the bpf build with '-idirafter'

The choice of -idirafter is to catch this error without injecting
unexpected include behavior: if an arch-specific tree is built
for bpf in the future, this will be correctly found by Clang.

Signed-off-by: Sirio Balmelli <sirio@b-ad.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-23 14:32:09 +02:00
Anders Roxell
1a2b80ecc7 selftests: net: reuseport_bpf_numa: don't fail if no numa support
The reuseport_bpf_numa test case fails there's no numa support.  The
test shouldn't fail if there's no support it should be skipped.

Fixes: 3c2c3c16aa ("reuseport, bpf: add test case for bpf_get_numa_node_id")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-23 12:21:02 +02:00
Martin KaFai Lau
61746dbe1a bpf: btf: Add tests for the btf uapi changes
This patch does the followings:
1. Modify libbpf and test_btf to reflect the uapi changes in btf
2. Add test for the btf_header changes
3. Add tests for array->index_type
4. Add err_str check to the tests
5. Fix a 4 bytes hole in "struct test #1" by swapping "m" and "n"

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-23 12:03:32 +02:00
Martin KaFai Lau
f03b15d34b bpf: btf: Sync bpf.h and btf.h to tools
This patch sync the uapi bpf.h and btf.h to tools.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-23 12:03:32 +02:00
Dan Williams
5d8beee20d x86, nfit_test: Add unit test for memcpy_mcsafe()
Given the fact that the ACPI "EINJ" (error injection) facility is not
universally available, implement software infrastructure to validate the
memcpy_mcsafe() exception handling implementation.

For each potential read exception point in memcpy_mcsafe(), inject a
emulated exception point at the address identified by 'mcsafe_inject'
variable. With this infrastructure implement a test to validate that the
'bytes remaining' calculation is correct for a range of various source
buffer alignments.

This code is compiled out by default. The CONFIG_MCSAFE_DEBUG
configuration symbol needs to be manually enabled by editing
Kconfig.debug. I.e. this functionality can not be accidentally enabled
by a user / distro, it's only for development.

Cc: <x86@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2018-05-22 23:18:31 -07:00
David Ahern
abb1860aac selftests: fib_tests: Add ipv4 route add append replace tests
Add IPv4 route tests covering add, append and replace permutations.
Assumes the ability to add a basic single path route works; this is
required for example when adding an address to an interface.

$ fib_tests.sh -t ipv4_rt

IPv4 route add / append tests
    TEST: Attempt to add duplicate route - gw                           [ OK ]
    TEST: Attempt to add duplicate route - dev only                     [ OK ]
    TEST: Attempt to add duplicate route - reject route                 [ OK ]
    TEST: Add new nexthop for existing prefix                           [ OK ]
    TEST: Append nexthop to existing route - gw                         [ OK ]
    TEST: Append nexthop to existing route - dev only                   [ OK ]
    TEST: Append nexthop to existing route - reject route               [ OK ]
    TEST: Append nexthop to existing reject route - gw                  [ OK ]
    TEST: Append nexthop to existing reject route - dev only            [ OK ]
    TEST: add multipath route                                           [ OK ]
    TEST: Attempt to add duplicate multipath route                      [ OK ]
    TEST: Route add with different metrics                              [ OK ]
    TEST: Route delete with metric                                      [ OK ]

IPv4 route replace tests
    TEST: Single path with single path                                  [ OK ]
    TEST: Single path with multipath                                    [ OK ]
    TEST: Single path with reject route                                 [ OK ]
    TEST: Single path with single path via multipath attribute          [ OK ]
    TEST: Invalid nexthop                                               [ OK ]
    TEST: Single path - replace of non-existent route                   [ OK ]
    TEST: Multipath with multipath                                      [ OK ]
    TEST: Multipath with single path                                    [ OK ]
    TEST: Multipath with single path via multipath attribute            [ OK ]
    TEST: Multipath with reject route                                   [ OK ]
    TEST: Multipath - invalid first nexthop                             [ OK ]
    TEST: Multipath - invalid second nexthop                            [ OK ]
    TEST: Multipath - replace of non-existent route                     [ OK ]

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:19 -04:00
David Ahern
f9a5a9d89f selftests: fib_tests: Add ipv6 route add append replace tests
Add IPv6 route tests covering add, append and replace permutations.
Assumes the ability to add a basic single path route works; this is
required for example when adding an address to an interface.

$ fib_tests.sh -t ipv6_rt

IPv6 route add / append tests
    TEST: Attempt to add duplicate route - gw                           [ OK ]
    TEST: Attempt to add duplicate route - dev only                     [ OK ]
    TEST: Attempt to add duplicate route - reject route                 [ OK ]
    TEST: Add new route for existing prefix (w/o NLM_F_EXCL)            [ OK ]
    TEST: Append nexthop to existing route - gw                         [ OK ]
    TEST: Append nexthop to existing route - dev only                   [ OK ]
    TEST: Append nexthop to existing route - reject route               [ OK ]
    TEST: Append nexthop to existing reject route - gw                  [ OK ]
    TEST: Append nexthop to existing reject route - dev only            [ OK ]
    TEST: Add multipath route                                           [ OK ]
    TEST: Attempt to add duplicate multipath route                      [ OK ]
    TEST: Route add with different metrics                              [ OK ]
    TEST: Route delete with metric                                      [ OK ]

IPv6 route replace tests
    TEST: Single path with single path                                  [ OK ]
    TEST: Single path with multipath                                    [ OK ]
    TEST: Single path with reject route                                 [ OK ]
    TEST: Single path with single path via multipath attribute          [ OK ]
    TEST: Invalid nexthop                                               [ OK ]
    TEST: Single path - replace of non-existent route                   [ OK ]
    TEST: Multipath with multipath                                      [ OK ]
    TEST: Multipath with single path                                    [ OK ]
    TEST: Multipath with single path via multipath attribute            [ OK ]
    TEST: Multipath with reject route                                   [ OK ]
    TEST: Multipath - invalid first nexthop                             [ OK ]
    TEST: Multipath - invalid second nexthop                            [ OK ]
    TEST: Multipath - replace of non-existent route                     [ OK ]

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:19 -04:00
David Ahern
7df15e6c3e selftests: fib_tests: Add option to pause after each test
Add option to pause after each test before cleanup is done. Allows
user to do manual inspection or more ad-hoc testing after each test
with the setup in tact.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:19 -04:00
David Ahern
1c7447b4e8 selftests: fib_tests: Add command line options
Add command line options for controlling pause on fail, controlling
specific tests to run and verbose mode rather than relying on environment
variables.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:19 -04:00
David Ahern
37ce42c14e selftests: fib_tests: Add success-fail counts
As more tests are added, it is convenient to have a tally at the end.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22 14:44:18 -04:00
Adrian Hunter
5759a6820a perf machine: Allow for extra kernel maps
Identify extra kernel maps by name so that they can be distinguished
from the kernel map and module maps.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:59:22 -03:00
Adrian Hunter
4d004365e2 perf machine: Fix map_groups__split_kallsyms() for entry trampoline symbols
When kernel symbols are derived from /proc/kallsyms only (not using
vmlinux or /proc/kcore) map_groups__split_kallsyms() is used. However
that function makes assumptions that are not true with entry trampoline
symbols. For now, remove the entry trampoline symbols at that point, as
they are no longer needed at that point.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:55:59 -03:00
Adrian Hunter
4d99e41365 perf machine: Workaround missing maps for x86 PTI entry trampolines
On x86_64 the PTI entry trampolines are not in the kernel map created by
perf tools. That results in the addresses having no symbols and prevents
annotation.  It also causes Intel PT to have decoding errors at the
trampoline addresses.

Workaround that by creating maps for the trampolines.

At present the kernel does not export information revealing where the
trampolines are.  Until that happens, the addresses are hardcoded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:54:22 -03:00
Adrian Hunter
9cecca325e perf machine: Add nr_cpus_avail()
Add a function to return the number of the machine's available CPUs.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22 10:52:49 -03:00
David S. Miller
6f6e434aa2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.

TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.

The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.

Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-21 16:01:54 -04:00
Linus Torvalds
3b78ce4a34 Merge branch 'speck-v20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Merge speculative store buffer bypass fixes from Thomas Gleixner:

 - rework of the SPEC_CTRL MSR management to accomodate the new fancy
   SSBD (Speculative Store Bypass Disable) bit handling.

 - the CPU bug and sysfs infrastructure for the exciting new Speculative
   Store Bypass 'feature'.

 - support for disabling SSB via LS_CFG MSR on AMD CPUs including
   Hyperthread synchronization on ZEN.

 - PRCTL support for dynamic runtime control of SSB

 - SECCOMP integration to automatically disable SSB for sandboxed
   processes with a filter flag for opt-out.

 - KVM integration to allow guests fiddling with SSBD including the new
   software MSR VIRT_SPEC_CTRL to handle the LS_CFG based oddities on
   AMD.

 - BPF protection against SSB

.. this is just the core and x86 side, other architecture support will
come separately.

* 'speck-v20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
  bpf: Prevent memory disambiguation attack
  x86/bugs: Rename SSBD_NO to SSB_NO
  KVM: SVM: Implement VIRT_SPEC_CTRL support for SSBD
  x86/speculation, KVM: Implement support for VIRT_SPEC_CTRL/LS_CFG
  x86/bugs: Rework spec_ctrl base and mask logic
  x86/bugs: Remove x86_spec_ctrl_set()
  x86/bugs: Expose x86_spec_ctrl_base directly
  x86/bugs: Unify x86_spec_ctrl_{set_guest,restore_host}
  x86/speculation: Rework speculative_store_bypass_update()
  x86/speculation: Add virtualized speculative store bypass disable support
  x86/bugs, KVM: Extend speculation control for VIRT_SPEC_CTRL
  x86/speculation: Handle HT correctly on AMD
  x86/cpufeatures: Add FEATURE_ZEN
  x86/cpufeatures: Disentangle SSBD enumeration
  x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
  x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
  KVM: SVM: Move spec control call after restore of GS
  x86/cpu: Make alternative_msr_write work for 32-bit code
  x86/bugs: Fix the parameters alignment and missing void
  x86/bugs: Make cpu_show_common() static
  ...
2018-05-21 11:23:26 -07:00
Jin Yao
7ebaf4890f perf annotate: Support '--group' option
With the '--group' option, even for non-explicit group, 'perf annotate'
will enable the group output.

For example,

  $ perf record -e cycles,branches ./div
  $ perf annotate main --stdio --group

                 :            Disassembly of section .text:
                 :
                 :            00000000004004b0 <main>:
                 :            main():
                 :
                 :                    return i;
                 :            }
                 :
                 :            int main(void)
                 :            {
    0.00    0.00 :   4004b0:       push   %rbx
                 :                    int i;
                 :                    int flag;
                 :                    volatile double x = 1212121212, y = 121212;
                 :
                 :                    s_randseed = time(0);
    0.00    0.00 :   4004b1:       xor    %edi,%edi
                 :                    srand(s_randseed);
    0.00    0.00 :   4004b3:       mov    $0x77359400,%ebx
                 :
                 :                    return i;
                 :            }
                 :

But if without --group, there is only one event reported.

  $ perf annotate main --stdio

         :            Disassembly of section .text:
         :
         :            00000000004004b0 <main>:
         :            main():
         :
         :                    return i;
         :            }
         :
         :            int main(void)
         :            {
    0.00 :   4004b0:       push   %rbx
         :                    int i;
         :                    int flag;
         :                    volatile double x = 1212121212, y = 121212;
         :
         :                    s_randseed = time(0);
    0.00 :   4004b1:       xor    %edi,%edi
         :                    srand(s_randseed);
    0.00 :   4004b3:       mov    $0x77359400,%ebx
         :
         :                    return i;
         :            }

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21 14:41:25 -03:00
Jin Yao
a26bb0ba70 perf report: Use perf_evlist__force_leader to support '--group'
Since we created a new function perf_evlist__force_leader(), remove the
old code and use that new evlist method.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21 14:41:01 -03:00
Jin Yao
e2bdbe80a0 perf evlist: Introduce force_leader() method
For non-explicit group (e.g. those created with -e '{eventA,eventB}'),
'perf report' supports a option '--group' which can enable group output.

We also need to support 'perf annotate' with the same '--group'.

Create a new function perf_evlist__force_leader() which contains common
code to force setting the group leader.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526914666-31839-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21 14:40:54 -03:00
Linus Torvalds
5aef268ace Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix refcounting bug for connections in on-packet scheduling mode of
    IPVS, from Julian Anastasov.

 2) Set network header properly in AF_PACKET's packet_snd, from Willem
    de Bruijn.

 3) Fix regressions in 3c59x by converting to generic DMA API. It was
    relying upon the hack that the PCI DMA interfaces would accept NULL
    for EISA devices. From Christoph Hellwig.

 4) Remove RDMA devices before unregistering netdev in QEDE driver, from
    Michal Kalderon.

 5) Use after free in TUN driver ptr_ring usage, from Jason Wang.

 6) Properly check for missing netlink attributes in SMC_PNETID
    requests, from Eric Biggers.

 7) Set DMA mask before performaing any DMA operations in vmxnet3
    driver, from Regis Duchesne.

 8) Fix mlx5 build with SMP=n, from Saeed Mahameed.

 9) Classifier fixes in bcm_sf2 driver from Florian Fainelli.

10) Tuntap use after free during release, from Jason Wang.

11) Don't use stack memory in scatterlists in tls code, from Matt
    Mullins.

12) Not fully initialized flow key object in ipv4 routing code, from
    David Ahern.

13) Various packet headroom bug fixes in ip6_gre driver, from Petr
    Machata.

14) Remove queues from XPS maps using correct index, from Amritha
    Nambiar.

15) Fix use after free in sock_diag, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits)
  net: ip6_gre: fix tunnel metadata device sharing.
  cxgb4: fix offset in collecting TX rate limit info
  net: sched: red: avoid hashing NULL child
  sock_diag: fix use-after-free read in __sk_free
  sh_eth: Change platform check to CONFIG_ARCH_RENESAS
  net: dsa: Do not register devlink for unused ports
  net: Fix a bug in removing queues from XPS map
  bpf: fix truncated jump targets on heavy expansions
  bpf: parse and verdict prog attach may race with bpf map update
  bpf: sockmap update rollback on error can incorrectly dec prog refcnt
  net: test tailroom before appending to linear skb
  net: ip6_gre: Fix ip6erspan hlen calculation
  net: ip6_gre: Split up ip6gre_changelink()
  net: ip6_gre: Split up ip6gre_newlink()
  net: ip6_gre: Split up ip6gre_tnl_change()
  net: ip6_gre: Split up ip6gre_tnl_link_config()
  net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
  net: ip6_gre: Request headroom in __gre6_xmit()
  selftests/bpf: check return value of fopen in test_verifier.c
  erspan: fix invalid erspan version.
  ...
2018-05-21 08:37:48 -07:00
Michael Neuling
00c946a06e selftests/powerpc: Remove redundant cp_abort test
Paste on POWER9 only works on accelerators and no longer on real
memory. Hence this test is broken so remove it.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-05-21 14:48:02 +10:00
Linus Torvalds
8a6bd2f40e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "An unfortunately larger set of fixes, but a large portion is
  selftests:

   - Fix the missing clusterid initializaiton for x2apic cluster
     management which caused boot failures due to IPIs being sent to the
     wrong cluster

   - Drop TX_COMPAT when a 64bit executable is exec()'ed from a compat
     task

   - Wrap access to __supported_pte_mask in __startup_64() where clang
     compile fails due to a non PC relative access being generated.

   - Two fixes for 5 level paging fallout in the decompressor:

      - Handle GOT correctly for paging_prepare() and
        cleanup_trampoline()

      - Fix the page table handling in cleanup_trampoline() to avoid
        page table corruption.

   - Stop special casing protection key 0 as this is inconsistent with
     the manpage and also inconsistent with the allocation map handling.

   - Override the protection key wen moving away from PROT_EXEC to
     prevent inaccessible memory.

   - Fix and update the protection key selftests to address breakage and
     to cover the above issue

   - Add a MOV SS self test"

[ Part of the x86 fixes were in the earlier core pull due to dependencies ]

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
  x86/apic/x2apic: Initialize cluster ID properly
  x86/boot/compressed/64: Fix moving page table out of trampoline memory
  x86/boot/compressed/64: Set up GOT for paging_prepare() and cleanup_trampoline()
  x86/pkeys: Do not special case protection key 0
  x86/pkeys/selftests: Add a test for pkey 0
  x86/pkeys/selftests: Save off 'prot' for allocations
  x86/pkeys/selftests: Fix pointer math
  x86/pkeys: Override pkey when moving away from PROT_EXEC
  x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
  x86/pkeys/selftests: Add PROT_EXEC test
  x86/pkeys/selftests: Factor out "instruction page"
  x86/pkeys/selftests: Allow faults on unknown keys
  x86/pkeys/selftests: Avoid printf-in-signal deadlocks
  x86/pkeys/selftests: Remove dead debugging code, fix dprint_in_signal
  x86/pkeys/selftests: Stop using assert()
  x86/pkeys/selftests: Give better unexpected fault error messages
  x86/selftests: Add mov_to_ss test
  x86/mpx/selftests: Adjust the self-test to fresh distros that export the MPX ABI
  x86/pkeys/selftests: Adjust the self-test to fresh distros that export the pkeys ABI
  ...
2018-05-20 11:28:32 -07:00
Linus Torvalds
95bcce4d42 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:

 - fix segfault when processing unknown threads in cs-etm

 - fix "perf test inet_pton" on s390 failing due to missing inline

 - display all available events on 'perf annotate --stdio'

 - add missing newline when parsing an empty BPF program

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Add missing newline when parsing empty BPF proggie
  perf cs-etm: Remove redundant space
  perf cs-etm: Support unknown_thread in cs_etm_auxtrace
  perf annotate: Display all available events on --stdio
  perf test: "probe libc's inet_pton" fails on s390 due to missing inline
2018-05-20 11:18:42 -07:00
Linus Torvalds
583dbad340 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:

 - Unbreak the BPF compilation which got broken by the unconditional
   requirement of asm-goto, which is not supported by clang.

 - Prevent probing on exception masking instructions in uprobes and
   kprobes to avoid the issues of the delayed exceptions instead of
   having an ugly workaround.

 - Prevent a double free_page() in the error path of do_kexec_load()

 - A set of objtool updates addressing various issues mostly related to
   switch tables and the noreturn detection for recursive sibling calls

 - Header sync for tools.

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Detect RIP-relative switch table references, part 2
  objtool: Detect RIP-relative switch table references
  objtool: Support GCC 8 switch tables
  objtool: Support GCC 8's cold subfunctions
  objtool: Fix "noreturn" detection for recursive sibling calls
  objtool, kprobes/x86: Sync the latest <asm/insn.h> header with tools/objtool/arch/x86/include/asm/insn.h
  x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation
  uprobes/x86: Prohibit probing on MOV SS instruction
  kprobes/x86: Prohibit probing on exception masking instructions
  x86/kexec: Avoid double free_page() upon do_kexec_load() failure
2018-05-20 10:01:38 -07:00
Martin Kelly
55dda0abcf tools: iio: iio_generic_buffer: allow continuous looping
Sometimes it's useful to stream samples forever, such as when
stress-testing a driver overnight to check for memory leaks or other
issues. When the program receives a signal, it will gracefully cleanup,
so it is still safe to terminate at any time.

Add support for specifying a negative -c option, meaning that we should
loop forever. To do so, we need to use a long long (instead of just
long) for num_loops so that current code specifying num_loops greater
than UNSIGNED_LONG_MAX doesn't break.

Signed-off-by: Martin Kelly <mkelly@xevo.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2018-05-20 14:55:58 +01:00
Martin Kelly
71b52d2c74 tools: iio: iio_generic_buffer: fix types to match
Several types are mismatched and causing implicit conversions.  Fix them
up so the types match.

Signed-off-by: Martin Kelly <mkelly@xevo.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2018-05-20 14:54:39 +01:00
Adrian Hunter
19422a9f2a perf tools: Fix kernel_start for PTI on x86
Opickn x86_64, PTI entry trampolines are less than the start of kernel text,
but still above 2^63. So leave kernel_start = 1ULL << 63 for x86_64.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526548928-20790-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:51 -03:00
Adrian Hunter
dbbd34a666 perf machine: Add machine__is() to identify machine arch
Add a function to identify the machine architecture.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/1526548928-20790-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:50 -03:00
Arnaldo Carvalho de Melo
cfc4033be7 perf bpf: Fixup include and examples install messages
Before:

  INSTALL  lib
install include/bpf/*.h '/home/acme/lib/include/perf/bpf'
  INSTALL  lib
install examples/bpf/*.c '/home/acme/lib/examples/perf/bpf'

After:

  INSTALL  lib
  INSTALL  include/bpf
  INSTALL  lib
  INSTALL  examples/bpf

Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: dd8e4ead6e ("perf bpf: Add bpf.h to be used in eBPF proggies")
Fixes: 8f12a2ff00 ("perf bpf: Add 'examples' directories")
Link: https://lkml.kernel.org/n/tip-icljqe87e8pak8mu6mkki9d4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:50 -03:00
Jin Yao
3e71fc0319 perf annotate: Create hotkey 'c' to show min/max cycles
In the 'perf annotate' view, a new hotkey 'c' is created for showing the
min/max cycles.

For example, when press 'c', the annotate view is:

  Percent│ IPC     Cycle(min/max)
         │
         │
         │                             Disassembly of section .text:
         │
         │                             000000000003aab0 <random@@GLIBC_2.2.5>:
    8.22 │3.92                           sub    $0x18,%rsp
         │3.92                           mov    $0x1,%esi
         │3.92                           xor    %eax,%eax
         │3.92                           cmpl   $0x0,argp_program_version_hook@@G
         │3.92             1(2/1)      ↓ je     20
         │                               lock   cmpxchg %esi,__abort_msg@@GLIBC_P
         │                             ↓ jne    29
         │                             ↓ jmp    43
         │1.10                     20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+
    8.93 │1.10             1(5/1)      ↓ je     43

When press 'c' again, the annotate view is switched back:

  Percent│ IPC Cycle
         │
         │
         │                Disassembly of section .text:
         │
         │                000000000003aab0 <random@@GLIBC_2.2.5>:
    8.22 │3.92              sub    $0x18,%rsp
         │3.92              mov    $0x1,%esi
         │3.92              xor    %eax,%eax
         │3.92              cmpl   $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x
         │3.92     1      ↓ je     20
         │                  lock   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
         │                ↓ jne    29
         │                ↓ jmp    43
         │1.10        20:   cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0
    8.93 │1.10     1      ↓ je     43

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526569118-14217-3-git-send-email-yao.jin@linux.intel.com
[ Rename all maxmin to minmax ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19 06:42:49 -03:00
Josh Poimboeuf
7dec80ccbe objtool: Detect RIP-relative switch table references, part 2
With the following commit:

  fd35c88b74 ("objtool: Support GCC 8 switch tables")

I added a "can't find switch jump table" warning, to stop covering up
silent failures if add_switch_table() can't find anything.

That warning found yet another bug in the objtool switch table detection
logic.  For cases 1 and 2 (as described in the comments of
find_switch_table()), the find_symbol_containing() check doesn't adjust
the offset for RIP-relative switch jumps.

Incidentally, this bug was already fixed for case 3 with:

  6f5ec2993b ("objtool: Detect RIP-relative switch table references")

However, that commit missed the fix for cases 1 and 2.

The different cases are now starting to look more and more alike.  So
fix the bug by consolidating them into a single case, by checking the
original dynamic jump instruction in the case 3 loop.

This also simplifies the code and makes it more robust against future
switch table detection issues -- of which I'm sure there will be many...

Switch table detection has been the most fragile area of objtool, by
far.  I long for the day when we'll have a GCC plugin for annotating
switch tables.  Linus asked me to delay such a plugin due to the
flakiness of the plugin infrastructure in older versions of GCC, so this
rickety code is what we're stuck with for now.  At least the code is now
a little simpler than it was.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f400541613d45689086329432f3095119ffbc328.1526674218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-19 08:10:04 +02:00
Ross Zwisler
fd8f58c40b radix tree test suite: multi-order iteration race
Add a test which shows a race in the multi-order iteration code.  This
test reliably hits the race in under a second on my machine, and is the
result of a real bug report against kernel a production v4.15 based
kernel (4.15.6-300.fc27.x86_64).  With a real kernel this issue is hit
when using order 9 PMD DAX radix tree entries.

The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree.  Remember that an order 2
entry looks like this:

  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'

When we delete 'entry' from the tree, we call :

  radix_tree_delete()
    radix_tree_delete_item()
      __radix_tree_delete()
        replace_slot()

replace_slot() first removes the siblings in order from the first to the
last, then at then replaces 'entry' with NULL.  This means that for a
brief period of time we end up with one or more of the siblings removed,
so:

  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection.  This is a common case in
mm/filemap.c.

The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:

                                      V preceding slot
  struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                              ^ current slot

This lets you find the first sibling, and you skip them all in order.

But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:

                                             V preceding slot
  struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                    ^ current slot

This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers.  This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.

In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.

In the radix tree test suite this will be caught by the address
sanitizer:

  ==27063==ERROR: AddressSanitizer: heap-buffer-overflow on address
  0x60c0008ae400 at pc 0x00000040ce4f bp 0x7fa89b8fcad0 sp 0x7fa89b8fcac0
  READ of size 8 at 0x60c0008ae400 thread T3
      #0 0x40ce4e in __radix_tree_next_slot /home/rzwisler/project/linux/tools/testing/radix-tree/radix-tree.c:1660
      #1 0x4022cc in radix_tree_next_slot linux/../../../../include/linux/radix-tree.h:567
      #2 0x4022cc in iterator_func /home/rzwisler/project/linux/tools/testing/radix-tree/multiorder.c:655
      #3 0x7fa8a088d50a in start_thread (/lib64/libpthread.so.0+0x750a)
      #4 0x7fa8a03bd16e in clone (/lib64/libc.so.6+0xf516e)

Link: http://lkml.kernel.org/r/20180503192430.7582-5-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Ross Zwisler
3e252fa7d4 radix tree test suite: add item_delete_rcu()
Currently the lifetime of "struct item" entries in the radix tree are
not controlled by RCU, but are instead deleted inline as they are
removed from the tree.

In the following patches we add a test which has threads iterating over
items pulled from the tree and verifying them in an
rcu_read_lock()/rcu_read_unlock() section.  This means that though an
item has been removed from the tree it could still be being worked on by
other threads until the RCU grace period expires.  So, we need to
actually free the "struct item" structures at the end of the grace
period, just as we do with "struct radix_tree_node" items.

Link: http://lkml.kernel.org/r/20180503192430.7582-4-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Ross Zwisler
dcbbf25adb radix tree test suite: fix compilation issue
Pulled from a patch from Matthew Wilcox entitled "xarray: Add definition
of struct xarray":

> From: Matthew Wilcox <mawilcox@microsoft.com>
> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

  https://patchwork.kernel.org/patch/10341249/

These defines fix this compilation error:

  In file included from ./linux/radix-tree.h:6:0,
                   from ./linux/../../../../include/linux/idr.h:15,
                   from ./linux/idr.h:1,
                   from idr.c:4:
  ./linux/../../../../include/linux/idr.h: In function `idr_init_base':
  ./linux/../../../../include/linux/radix-tree.h:129:2: warning: implicit declaration of function `spin_lock_init'; did you mean `spinlock_t'? [-Wimplicit-function-declaration]
    spin_lock_init(&(root)->xa_lock);    \
    ^
  ./linux/../../../../include/linux/idr.h:126:2: note: in expansion of macro `INIT_RADIX_TREE'
    INIT_RADIX_TREE(&idr->idr_rt, IDR_RT_MARKER);
    ^~~~~~~~~~~~~~~

by providing a spin_lock_init() wrapper for the v4.17-rc* version of the
radix tree test suite.

Link: http://lkml.kernel.org/r/20180503192430.7582-3-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
Ross Zwisler
8d9fa88edd radix tree test suite: fix mapshift build target
Commit c6ce3e2fe3 ("radix tree test suite: Add config option for map
shift") introduced a phony makefile target called 'mapshift' that ends
up generating the file generated/map-shift.h.  This phony target was
then added as a dependency of the top level 'targets' build target,
which is what is run when you go to tools/testing/radix-tree and just
type 'make'.

Unfortunately, this phony target doesn't actually work as a dependency,
so you end up getting:

  $ make
  make: *** No rule to make target 'generated/map-shift.h', needed by 'main.o'.  Stop.
  make: *** Waiting for unfinished jobs....

Fix this by making the file generated/map-shift.h our real makefile
target, and add this a dependency of the top level build target.

Link: http://lkml.kernel.org/r/20180503192430.7582-2-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: CR, Sapthagirish <sapthagirish.cr@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-18 17:17:12 -07:00
John Fastabend
4da0dcabe4 bpf: add sk_msg prog sk access tests to test_verifier
Add tests for BPF_PROG_TYPE_SK_MSG to test_verifier for read access
to new sk fields.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 22:44:11 +02:00
Anders Roxell
a6837d2667 selftests: bpf: config: enable NET_SCH_INGRESS for xdp_meta.sh
When running bpf's selftest test_xdp_meta.sh it fails:
./test_xdp_meta.sh
Error: Specified qdisc not found.
selftests: test_xdp_meta [FAILED]

Need to enable CONFIG_NET_SCH_INGRESS and CONFIG_NET_CLS_ACT to get the
test to pass.

Fixes: 22c8852624 ("bpf: improve selftests and add tests for meta pointer")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-18 21:39:39 +02:00
Jin Yao
48659ebf37 perf annotate: Record the min/max cycles
Currently perf has a feature to account cycles for LBRs

For example, on skylake:

  perf record -b ...
  perf report or perf annotate

And then browsing the annotate browser gives average cycle counts for
program blocks.

For some analysis it would be useful if we could know not only the
average cycles but also the min and max cycles.

This patch records the min and max cycles.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1526569118-14217-2-git-send-email-yao.jin@linux.intel.com
[ Switch from max/min to min/max ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-18 16:31:41 -03:00
Sandipan Das
7903a70867 perf script: Show symbol offsets by default
Since the ip shown for a symbol is now always a virtual address, it
becomes difficult to correlate this with objdump output and determine
the exact instruction address. So, we always show the offset from the
start of the symbol.

This can be verified on a powerpc64le system running Fedora 27 as
follows:

  # perf probe -a sys_write
  # perf record -e probe:sys_write -g ~/test

Before applying this patch:

  # perf script

  test  9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0)
          c0000000004025b0 sys_write (/lib/modules/4.17.0-rc4+/build/vmlinux)
          c00000000000b9e0 system_call (/lib/modules/4.17.0-rc4+/build/vmlinux)
              7fffb70d8234 __GI___libc_write (/usr/lib64/libc-2.26.so)
              7fffb7052c74 _IO_file_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
                  5afc1818 [unknown] ([unknown])
              7fffb7051a60 new_do_write (/usr/lib64/libc-2.26.so)
              7fffb7054638 _IO_do_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
              7fffb7054bbc _IO_file_overflow@@GLIBC_2.17 (/usr/lib64/libc-2.26.so)
              7fffb7055a24 __overflow (/usr/lib64/libc-2.26.so)
              7fffb7044548 _IO_puts (/usr/lib64/libc-2.26.so)
                  10000440 main (/home/sandipan/test)
              7fffb6fe36a0 generic_start_main.isra.0 (/usr/lib64/libc-2.26.so)
              7fffb6fe3898 __libc_start_main (/usr/lib64/libc-2.26.so)
                         0 [unknown] ([unknown])
  ...

After applying this patch:

  # perf script

  test  9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0)
          c0000000004025b0 sys_write+0x10 (/lib/modules/4.17.0-rc4+/build/vmlinux)
          c00000000000b9e0 system_call+0x58 (/lib/modules/4.17.0-rc4+/build/vmlinux)
              7fffb70d8234 __GI___libc_write+0x24 (/usr/lib64/libc-2.26.so)
              7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 (/usr/lib64/libc-2.26.so)
                  5afc1818 [unknown] ([unknown])
              7fffb7051a60 new_do_write+0x90 (/usr/lib64/libc-2.26.so)
              7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 (/usr/lib64/libc-2.26.so)
              7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c (/usr/lib64/libc-2.26.so)
              7fffb7055a24 __overflow+0x64 (/usr/lib64/libc-2.26.so)
              7fffb7044548 _IO_puts+0x218 (/usr/lib64/libc-2.26.so)
                  10000440 main+0x20 (/home/sandipan/test)
              7fffb6fe36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
              7fffb6fe3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
                         0 [unknown] ([unknown])
  ...

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180517063326.6319-2-sandipan@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-18 16:31:40 -03:00