Commit Graph

22212 Commits

Author SHA1 Message Date
David S. Miller
a57066b1a0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The UDP reuseport conflict was a little bit tricky.

The net-next code, via bpf-next, extracted the reuseport handling
into a helper so that the BPF sk lookup code could invoke it.

At the same time, the logic for reuseport handling of unconnected
sockets changed via commit efc6b6f6c3
which changed the logic to carry on the reuseport result into the
rest of the lookup loop if we do not return immediately.

This requires moving the reuseport_has_conns() logic into the callers.

While we are here, get rid of inline directives as they do not belong
in foo.c files.

The other changes were cases of more straightforward overlapping
modifications.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 17:49:04 -07:00
Linus Torvalds
1b64b2e244 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master
Pull networking fixes from David Miller:

 1) Fix RCU locaking in iwlwifi, from Johannes Berg.

 2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.

 3) Fix race in updating pause settings in bnxt_en, from Vasundhara
    Volam.

 4) Propagate error return properly during unbind failures in ax88172a,
    from George Kennedy.

 5) Fix memleak in adf7242_probe, from Liu Jian.

 6) smc_drv_probe() can leak, from Wang Hai.

 7) Don't muck with the carrier state if register_netdevice() fails in
    the bonding driver, from Taehee Yoo.

 8) Fix memleak in dpaa_eth_probe, from Liu Jian.

 9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
    Murali Karicheri.

10) Don't lose ionic RSS hash settings across FW update, from Shannon
    Nelson.

11) Fix clobbered SKB control block in act_ct, from Wen Xu.

12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang.

13) IS_UDPLITE cleanup a long time ago, incorrectly handled
    transformations involving UDPLITE_RECV_CC. From Miaohe Lin.

14) Unbalanced locking in netdevsim, from Taehee Yoo.

15) Suppress false-positive error messages in qed driver, from Alexander
    Lobakin.

16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.

17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.

18) Uninitialized value in geneve_changelink(), from Cong Wang.

19) Fix deadlock in xen-netfront, from Andera Righi.

19) flush_backlog() frees skbs with IRQs disabled, so should use
    dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
    Kasiviswanathan.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
  drivers/net/wan: lapb: Corrected the usage of skb_cow
  dev: Defer free of skbs in flush_backlog
  qrtr: orphan socket in qrtr_release()
  xen-netfront: fix potential deadlock in xennet_remove()
  flow_offload: Move rhashtable inclusion to the source file
  geneve: fix an uninitialized value in geneve_changelink()
  bonding: check return value of register_netdevice() in bond_newlink()
  tcp: allow at most one TLP probe per flight
  AX.25: Prevent integer overflows in connect and sendmsg
  cxgb4: add missing release on skb in uld_send()
  net: atlantic: fix PTP on AQC10X
  AX.25: Prevent out-of-bounds read in ax25_sendmsg()
  sctp: shrink stream outq when fails to do addstream reconf
  sctp: shrink stream outq only when new outcnt < old outcnt
  AX.25: Fix out-of-bounds read in ax25_connect()
  enetc: Remove the mdio bus on PF probe bailout
  net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
  net: ethernet: ave: Fix error returns in ave_init
  drivers/net/wan/x25_asy: Fix to make it work
  ipvs: fix the connection sync failed in some cases
  ...
2020-07-25 11:50:59 -07:00
David Ahern
1b6687e31a vrf: Handle CONFIG_SYSCTL not set
Randy reported compile failure when CONFIG_SYSCTL is not set/enabled:

ERROR: modpost: "sysctl_vals" [drivers/net/vrf.ko] undefined!

Fix by splitting out the sysctl init and cleanup into helpers that
can be set to do nothing when CONFIG_SYSCTL is disabled. In addition,
move vrf_strict_mode and vrf_strict_mode_change to above
vrf_shared_table_handler (code move only) and wrap all of it
in the ifdef CONFIG_SYSCTL.

Update the strict mode tests to check for the existence of the
/proc/sys entry.

Fixes: 33306f1aaf ("vrf: add sysctl parameter for strict mode")
Cc: Andrea Mayer <andrea.mayer@uniroma2.it>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23 17:51:04 -07:00
David S. Miller
dee72f8a0c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-21

The following pull-request contains BPF updates for your *net-next* tree.

We've added 46 non-merge commits during the last 6 day(s) which contain
a total of 68 files changed, 4929 insertions(+), 526 deletions(-).

The main changes are:

1) Run BPF program on socket lookup, from Jakub.

2) Introduce cpumap, from Lorenzo.

3) s390 JIT fixes, from Ilya.

4) teach riscv JIT to emit compressed insns, from Luke.

5) use build time computed BTF ids in bpf iter, from Yonghong.
====================

Purely independent overlapping changes in both filter.h and xdp.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22 12:35:33 -07:00
Paolo Pisati
b346c0c858 selftest: txtimestamp: fix net ns entry logic
According to 'man 8 ip-netns', if `ip netns identify` returns an empty string,
there's no net namespace associated with current PID: fix the net ns entrance
logic.

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21 16:11:07 -07:00
Tony Ambardar
9165e1d70f bpftool: Use only nftw for file tree parsing
The bpftool sources include code to walk file trees, but use multiple
frameworks to do so: nftw and fts. While nftw conforms to POSIX/SUSv3 and
is widely available, fts is not conformant and less common, especially on
non-glibc systems. The inconsistent framework usage hampers maintenance
and portability of bpftool, in particular for embedded systems.

Standardize code usage by rewriting one fts-based function to use nftw and
clean up some related function warnings by extending use of "const char *"
arguments. This change helps in building bpftool against musl for OpenWrt.

Also fix an unsafe call to dirname() by duplicating the string to pass,
since some implementations may directly alter it. The same approach is
used in libbpf.c.

Signed-off-by: Tony Ambardar <Tony.Ambardar@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200721024817.13701-1-Tony.Ambardar@gmail.com
2020-07-21 23:42:56 +02:00
Yonghong Song
fce557bcef bpf: Make btf_sock_ids global
tcp and udp bpf_iter can reuse some socket ids in
btf_sock_ids, so make it global.

I put the extern definition in btf_ids.h as a central
place so it can be easily discovered by developers.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163402.1393427-1-yhs@fb.com
2020-07-21 13:26:26 -07:00
Yonghong Song
0f12e584b2 bpf: Add BTF_ID_LIST_GLOBAL in btf_ids.h
Existing BTF_ID_LIST used a local static variable
to store btf_ids. This patch provided a new macro
BTF_ID_LIST_GLOBAL to store btf_ids in a global
variable which can be shared among multiple files.

The existing BTF_ID_LIST is still retained.
Two reasons. First, BTF_ID_LIST is also used to build
btf_ids for helper arguments which typically
is an array of 5. Since typically different
helpers have different signature, it makes
little sense to share them. Second, some
current computed btf_ids are indeed local.
If later those btf_ids are shared between
different files, they can use BTF_ID_LIST_GLOBAL then.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/bpf/20200720163401.1393159-1-yhs@fb.com
2020-07-21 13:26:26 -07:00
Yonghong Song
d8dfe5bfe8 tools/bpf: Sync btf_ids.h to tools
Sync kernel header btf_ids.h to tools directory.
Also define macro CONFIG_DEBUG_INFO_BTF before
including btf_ids.h in prog_tests/resolve_btfids.c
since non-stub definitions for BTF_ID_LIST etc. macros
are defined under CONFIG_DEBUG_INFO_BTF. This
prevented test_progs from failing.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720163359.1393079-1-yhs@fb.com
2020-07-21 13:26:26 -07:00
Ilya Leoshkevich
e4d9c23207 samples/bpf, selftests/bpf: Use bpf_probe_read_kernel
A handful of samples and selftests fail to build on s390, because
after commit 0ebeea8ca8 ("bpf: Restrict bpf_probe_read{, str}()
only to archs where they work") bpf_probe_read is not available
anymore.

Fix by using bpf_probe_read_kernel.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720114806.88823-1-iii@linux.ibm.com
2020-07-21 13:26:26 -07:00
Ilya Leoshkevich
6bd557275a selftests/bpf: Fix test_lwt_seg6local.sh hangs
OpenBSD netcat (Debian patchlevel 1.195-2) does not seem to react to
SIGINT for whatever reason, causing prefix.pl to hang after
test_lwt_seg6local.sh exits due to netcat inheriting
test_lwt_seg6local.sh's file descriptors.

Fix by using SIGTERM instead.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200720101810.84299-1-iii@linux.ibm.com
2020-07-21 13:26:26 -07:00
YueHaibing
956fcfcd35 tools/bpftool: Fix error handing in do_skeleton()
Fix pass 0 to PTR_ERR, also dump more err info using
libbpf_strerror.

Fixes: 5dc7a8b211 ("bpftool, selftests/bpf: Embed object file inside skeleton")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200717123059.29624-1-yuehaibing@huawei.com
2020-07-21 13:26:25 -07:00
Ian Rogers
da7a35062b libbpf bpf_helpers: Use __builtin_offsetof for offsetof
The non-builtin route for offsetof has a dependency on size_t from
stdlib.h/stdint.h that is undeclared and may break targets.
The offsetof macro in bpf_helpers may disable the same macro in other
headers that have a #ifdef offsetof guard. Rather than add additional
dependencies improve the offsetof macro declared here to use the
builtin that is available since llvm 3.7 (the first with a BPF backend).

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200720061741.1514673-1-irogers@google.com
2020-07-21 13:26:25 -07:00
Ilya Leoshkevich
2ea4859807 selftests: bpf: test_kmod.sh: Fix running out of srctree
When running out of srctree, relative path to lib/test_bpf.ko is
different than when running in srctree. Check $building_out_of_srctree
environment variable and use a different relative path if needed.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717165326.6786-2-iii@linux.ibm.com
2020-07-21 13:26:24 -07:00
Thomas Richter
3d3af181d3 s390/cpum_cf,perf: change DFLT_CCERROR counter name
Change the counter name DLFT_CCERROR to DLFT_CCFINISH on IBM z15.
This counter counts completed DEFLATE instructions with exit code
0, 1 or 2. Since exit code 0 means success and exit code 1 or 2
indicate errors, change the counter name to avoid confusion.
This counter is incremented each time the DEFLATE instruction
completed regardless if an error was detected or not.

Fixes: d68d5d51dc ("s390/cpum_cf: Add new extended counters for IBM z15")
Fixes: e7950166e4 ("perf vendor events s390: Add new deflate counters for IBM z15")
Cc: stable@vger.kernel.org # v5.7
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-21 13:53:56 +02:00
Briana Oursler
2b9843fbe1 tc-testing: Add tdc to kselftests
Add tdc to existing kselftest infrastructure so that it can be run with
existing kselftests. TDC now generates objects in objdir/kselftest
without cluttering main objdir, leaves source directory clean, and
installs correctly in kselftest_install, properly adding itself to
run_kselftest.sh script.

Add tc-testing as a target of selftests/Makefile. Create tdc.sh to run
tdc.py targets with correct arguments. To support single target from
selftest/Makefile, combine tc-testing/bpf/Makefile and
tc-testing/Makefile. Move action.c up a directory to tc-testing/.

Tested with:
 make O=/tmp/{objdir} TARGETS="tc-testing" kselftest
 cd /tmp/{objdir}
 cd kselftest
 cd tc-testing
 ./tdc.sh

 make -C tools/testing/selftests/ TARGETS=tc-testing run_tests

 make TARGETS="tc-testing" kselftest
 cd tools/testing/selftests
 ./kselftest_install.sh /tmp/exampledir
 My VM doesn't run all the kselftests so I commented out all except my
 target and net/pmtu.sh then:
 cd /tmp/exampledir && ./run_kselftest.sh

Co-developed-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Briana Oursler <briana.oursler@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:29:37 -07:00
Vladimir Oltean
7570ebe041 testptp: add new options for perout phase and pulse width
Extend the example program for PTP ancillary functionality with the
ability to configure not only the periodic output's period (frequency),
but also the phase and duty cycle (pulse width) which were newly
introduced.

The ioctl level also needs to be updated to the new PTP_PEROUT_REQUEST2,
since the original PTP_PEROUT_REQUEST doesn't support this
functionality. For an in-tree testing program, not having explicit
backwards compatibility is fine, as it should always be tested with the
current kernel headers and sources.

Tested with an oscilloscope on the felix switch PHC:

echo '2 0' > /sys/class/ptp/ptp1/pins/switch_1588_dat0
./testptp -d /dev/ptp1 -p 1000000000 -w 100000000 -H 1000 -i 0

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:04:59 -07:00
Vladimir Oltean
4a09a98100 testptp: promote 'perout' variable to int64_t
Since 'perout' holds the nanosecond value of the signal's period, it
should be a 64-bit value. Current assumption is that it cannot be larger
than 1 second.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-20 18:04:59 -07:00
Christoph Hellwig
55db9c0e85 net: remove compat_sys_{get,set}sockopt
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.

This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.

It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-19 18:16:40 -07:00
Linus Torvalds
92188b41f1 Merge tag 'perf-tools-fixes-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into master
Pull perf tooling fixes from Arnaldo Carvalho de Melo:

 - Update hashmap.h from libbpf and kvm.h from x86's kernel UAPI.

 - Set opt->set in libsubcmd's OPT_CALLBACK_SET(). This fixes
   'perf record --switch-output-event event-name' usage"

* tag 'perf-tools-fixes-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  tools arch kvm: Sync kvm headers with the kernel sources
  perf tools: Sync hashmap.h with libbpf's
  libsubcmd: Fix OPT_CALLBACK_SET()
2020-07-19 12:35:07 -07:00
Linus Torvalds
721db9dfb1 Merge tag 'powerpc-5.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into master
Pull powerpc fixes from Michael Ellerman:
 "Some more powerpc fixes for 5.8:

   - A fix to the VAS code we merged this cycle, to report the proper
     error code to userspace for address translation failures. And a
     selftest update to match.

   - Another fix for our pkey handling of PROT_EXEC mappings.

   - A fix for a crash when booting a "secure VM" under an ultravisor
     with certain numbers of CPUs.

  Thanks to: Aneesh Kumar K.V, Haren Myneni, Laurent Dufour, Sandipan
  Das, Satheesh Rajendran, Thiago Jung Bauermann"

* tag 'powerpc-5.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  selftests/powerpc: Use proper error code to check fault address
  powerpc/vas: Report proper error code for address translation failure
  powerpc/pseries/svm: Fix incorrect check for shared_lppaca_size
  powerpc/book3s64/pkeys: Fix pkey_access_permitted() for execute disable pkey
2020-07-18 10:45:17 -07:00
Jakub Sitnicki
0ab5539f85 selftests/bpf: Tests for BPF_SK_LOOKUP attach point
Add tests to test_progs that exercise:

 - attaching/detaching/querying programs to BPF_SK_LOOKUP hook,
 - redirecting socket lookup to a socket selected by BPF program,
 - failing a socket lookup on BPF program's request,
 - error scenarios for selecting a socket from BPF program,
 - accessing BPF program context,
 - attaching and running multiple BPF programs.

Run log:

  bash-5.0# ./test_progs -n 70
  #70/1 query lookup prog:OK
  #70/2 TCP IPv4 redir port:OK
  #70/3 TCP IPv4 redir addr:OK
  #70/4 TCP IPv4 redir with reuseport:OK
  #70/5 TCP IPv4 redir skip reuseport:OK
  #70/6 TCP IPv6 redir port:OK
  #70/7 TCP IPv6 redir addr:OK
  #70/8 TCP IPv4->IPv6 redir port:OK
  #70/9 TCP IPv6 redir with reuseport:OK
  #70/10 TCP IPv6 redir skip reuseport:OK
  #70/11 UDP IPv4 redir port:OK
  #70/12 UDP IPv4 redir addr:OK
  #70/13 UDP IPv4 redir with reuseport:OK
  #70/14 UDP IPv4 redir skip reuseport:OK
  #70/15 UDP IPv6 redir port:OK
  #70/16 UDP IPv6 redir addr:OK
  #70/17 UDP IPv4->IPv6 redir port:OK
  #70/18 UDP IPv6 redir and reuseport:OK
  #70/19 UDP IPv6 redir skip reuseport:OK
  #70/20 TCP IPv4 drop on lookup:OK
  #70/21 TCP IPv6 drop on lookup:OK
  #70/22 UDP IPv4 drop on lookup:OK
  #70/23 UDP IPv6 drop on lookup:OK
  #70/24 TCP IPv4 drop on reuseport:OK
  #70/25 TCP IPv6 drop on reuseport:OK
  #70/26 UDP IPv4 drop on reuseport:OK
  #70/27 TCP IPv6 drop on reuseport:OK
  #70/28 sk_assign returns EEXIST:OK
  #70/29 sk_assign honors F_REPLACE:OK
  #70/30 sk_assign accepts NULL socket:OK
  #70/31 access ctx->sk:OK
  #70/32 narrow access to ctx v4:OK
  #70/33 narrow access to ctx v6:OK
  #70/34 sk_assign rejects TCP established:OK
  #70/35 sk_assign rejects UDP connected:OK
  #70/36 multi prog - pass, pass:OK
  #70/37 multi prog - drop, drop:OK
  #70/38 multi prog - pass, drop:OK
  #70/39 multi prog - drop, pass:OK
  #70/40 multi prog - pass, redir:OK
  #70/41 multi prog - redir, pass:OK
  #70/42 multi prog - drop, redir:OK
  #70/43 multi prog - redir, drop:OK
  #70/44 multi prog - redir, redir:OK
  #70 sk_lookup:OK
  Summary: 1/44 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-16-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
f7726cbea4 selftests/bpf: Add verifier tests for bpf_sk_lookup context access
Exercise verifier access checks for bpf_sk_lookup context fields.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-15-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
93a3545d81 tools/bpftool: Add name mappings for SK_LOOKUP prog and attach type
Make bpftool show human-friendly identifiers for newly introduced program
and attach type, BPF_PROG_TYPE_SK_LOOKUP and BPF_SK_LOOKUP, respectively.

Also, add the new prog type bash-completion, man page and help message.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200717103536.397595-14-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
499dd29d90 libbpf: Add support for SK_LOOKUP program type
Make libbpf aware of the newly added program type, and assign it a
section name.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-13-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Jakub Sitnicki
a352b32ae9 bpf: Sync linux/bpf.h to tools/
Newly added program, context type and helper is used by tests in a
subsequent patch. Synchronize the header file.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200717103536.397595-12-jakub@cloudflare.com
2020-07-17 20:18:17 -07:00
Paolo Pisati
aba69d49fb selftests: net: ip_defrag: modprobe missing nf_defrag_ipv6 support
Fix ip_defrag.sh when CONFIG_NF_DEFRAG_IPV6=m:

$ sudo ./ip_defrag.sh
+ set -e
+ mktemp -u XXXXXX
+ readonly NETNS=ns-rGlXcw
+ trap cleanup EXIT
+ setup
+ ip netns add ns-rGlXcw
+ ip -netns ns-rGlXcw link set lo up
+ ip netns exec ns-rGlXcw sysctl -w net.ipv4.ipfrag_high_thresh=9000000
+ ip netns exec ns-rGlXcw sysctl -w net.ipv4.ipfrag_low_thresh=7000000
+ ip netns exec ns-rGlXcw sysctl -w net.ipv4.ipfrag_time=1
+ ip netns exec ns-rGlXcw sysctl -w net.ipv6.ip6frag_high_thresh=9000000
+ ip netns exec ns-rGlXcw sysctl -w net.ipv6.ip6frag_low_thresh=7000000
+ ip netns exec ns-rGlXcw sysctl -w net.ipv6.ip6frag_time=1
+ ip netns exec ns-rGlXcw sysctl -w net.netfilter.nf_conntrack_frag6_high_thresh=9000000
+ cleanup
+ ip netns del ns-rGlXcw

$ ls -la /proc/sys/net/netfilter/nf_conntrack_frag6_high_thresh
ls: cannot access '/proc/sys/net/netfilter/nf_conntrack_frag6_high_thresh': No such file or directory

$ sudo modprobe nf_defrag_ipv6
$ ls -la /proc/sys/net/netfilter/nf_conntrack_frag6_high_thresh
-rw-r--r-- 1 root root 0 Jul 14 12:34 /proc/sys/net/netfilter/nf_conntrack_frag6_high_thresh

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-17 12:49:18 -07:00
Arnaldo Carvalho de Melo
25d4e7f513 tools arch kvm: Sync kvm headers with the kernel sources
To pick up the changes from:

  83d31e5271 ("KVM: nVMX: fixes for preemption timer migration")

That don't entail changes in tooling.

This silences these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h'
  diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17 09:39:16 -03:00
Arnaldo Carvalho de Melo
94fddb7ad0 perf tools: Sync hashmap.h with libbpf's
To pick up the changes in:

  b2f9f1535b ("libbpf: Fix libbpf hashmap on (I)LP32 architectures")

Silencing this warning:

  Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
  diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h

I'll eventually update the warning to remove the "Kernel ABI" part
and instead state libbpf when noticing that the original is at
"tools/lib/something".

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Jakub Bogusz <qboosh@pld-linux.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17 09:35:18 -03:00
Ravi Bangoria
a2db71b912 libsubcmd: Fix OPT_CALLBACK_SET()
Any option macro with _SET suffix should set opt->set variable which is
not happening for OPT_CALLBACK_SET(). This is causing issues with perf
record --switch-output-event. Fix that.

Before:

  # ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
           --switch-output-event syscalls:sys_enter_mmap
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.297 MB perf.data (657 samples) ]

After:

  $ ./perf record --overwrite -e sched:*switch,syscalls:sys_enter_mmap \
          --switch-output-event syscalls:sys_enter_mmap
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020061918144542 ]
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020061918144608 ]
  [ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020061918144660 ]
  ^C[ perf record: dump data: Woken up 1 times ]
  [ perf record: Dump perf.data.2020061918144784 ]
  [ perf record: Woken up 0 times to write data ]
  [ perf record: Dump perf.data.2020061918144803 ]
  [ perf record: Captured and wrote 0.419 MB perf.data.<timestamp> ]

Fixes: 636eb4d001 ("libsubcmd: Introduce OPT_CALLBACK_SET()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200619133412.50705-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-07-17 09:33:06 -03:00
Randy Dunlap
bfdfa51702 bpf: Drop duplicated words in uapi helper comments
Drop doubled words "will" and "attach".

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/6b9f71ae-4f8e-0259-2c5d-187ddaefe6eb@infradead.org
2020-07-16 21:00:09 +02:00
Stanislav Fomichev
e81e7a5337 selftests/bpf: Fix possible hang in sockopt_inherit
Andrii reported that sockopt_inherit occasionally hangs up on 5.5 kernel [0].
This can happen if server_thread runs faster than the main thread.
In that case, pthread_cond_wait will wait forever because
pthread_cond_signal was executed before the main thread was blocking.
Let's move pthread_mutex_lock up a bit to make sure server_thread
runs strictly after the main thread goes to sleep.

(Not sure why this is 5.5 specific, maybe scheduling is less
deterministic? But I was able to confirm that it does indeed
happen in a VM.)

[0] https://lore.kernel.org/bpf/CAEf4BzY0-bVNHmCkMFPgObs=isUAyg-dFzGDY7QWYkmm7rmTSg@mail.gmail.com/

Reported-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200715224107.3591967-1-sdf@google.com
2020-07-16 20:57:09 +02:00
Lorenzo Bianconi
0550012502 selftest: Add tests for XDP programs in CPUMAP entries
Similar to what have been done for DEVMAP, introduce tests to verify
ability to add a XDP program to an entry in a CPUMAP.
Verify CPUMAP programs can not be attached to devices as a normal
XDP program, and only programs with BPF_XDP_CPUMAP attach type can
be loaded in a CPUMAP.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/9c632fcea5382ea7b4578bd06b6eddf382c3550b.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Lorenzo Bianconi
4be556cf5a libbpf: Add SEC name for xdp programs attached to CPUMAP
As for DEVMAP, support SEC("xdp_cpumap/") as a short cut for loading
the program with type BPF_PROG_TYPE_XDP and expected attach type
BPF_XDP_CPUMAP.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/33174c41993a6d860d9c7c1f280a2477ee39ed11.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Lorenzo Bianconi
9216477449 bpf: cpumap: Add the possibility to attach an eBPF program to cpumap
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU run the eBPF program if the underlying hw does not support
RSS. Current supported verdicts are XDP_DROP and XDP_PASS.

This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packet-per-second:

$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps

Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Lorenzo Bianconi
644bfe51fa cpumap: Formalize map value as a named struct
As it has been already done for devmap, introduce 'struct bpf_cpumap_val'
to formalize the expected values that can be passed in for a CPUMAP.
Update cpumap code to use the struct.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/754f950674665dae6139c061d28c1d982aaf4170.1594734381.git.lorenzo@kernel.org
2020-07-16 17:00:32 +02:00
Ido Schimmel
46b171d7d7 selftests: mlxsw: Test policers' occupancy
Test that policers shared by different tc filters are correctly
reference counted by observing policers' occupancy via devlink-resource.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
5061e77326 selftests: mlxsw: Add scale test for tc-police
Query the maximum number of supported policers using devlink-resource
and test that this number can be reached by configuring tc filters with
police action. Test that an error is returned in case the maximum number
is exceeded.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
cb12d17632 selftests: mlxsw: tc_restrictions: Test tc-police restrictions
Test that upper and lower limits on rate and burst size imposed by the
device are rejected by the kernel.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Ido Schimmel
afe231d32e selftests: forwarding: Add tc-police tests
Test tc-police action in various scenarios such as Rx policing, Tx
policing, shared policer and police piped to mirred. The test passes
with both veth pairs and loopbacked ports.

# ./tc_police.sh
TEST: police on rx                                                  [ OK ]
TEST: police on tx                                                  [ OK ]
TEST: police with shared policer - rx                               [ OK ]
TEST: police with shared policer - tx                               [ OK ]
TEST: police rx and mirror                                          [ OK ]
TEST: police tx and mirror                                          [ OK ]

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-07-15 18:10:00 -07:00
Haren Myneni
f0479c4bcb selftests/powerpc: Use proper error code to check fault address
ERR_NX_TRANSLATION(CSB.CC=5) is for internal to VAS for fault handling
and should not used by OS. ERR_NX_AT_FAULT(CSB.CC=250) is the proper
error code should be reported by OS when NX encounters address
translation failure.

This patch uses CC=250 to determine the fault address when the request
is not successful.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0315251705baff94f678c33178491b5008723511.camel@linux.ibm.com
2020-07-15 23:10:17 +10:00
David S. Miller
df8201cc8b Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-14

The following pull-request contains BPF updates for your *net-next* tree.

We've added 21 non-merge commits during the last 1 day(s) which contain
a total of 20 files changed, 308 insertions(+), 279 deletions(-).

The main changes are:

1) Fix selftests/bpf build, from Alexei.

2) Fix resolve_btfids build issues, from Jiri.

3) Pull usermode-driver-cleanup set, from Eric.

4) Two minor fixes to bpfilter, from Alexei and Masahiro.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 17:00:52 -07:00
Paolo Pisati
651149f603 selftests: fib_nexthop_multiprefix: fix cleanup() netns deletion
During setup():
...
        for ns in h0 r1 h1 h2 h3
        do
                create_ns ${ns}
        done
...

while in cleanup():
...
        for n in h1 r1 h2 h3 h4
        do
                ip netns del ${n} 2>/dev/null
        done
...

and after removing the stderr redirection in cleanup():

$ sudo ./fib_nexthop_multiprefix.sh
...
TEST: IPv4: host 0 to host 3, mtu 1400                              [ OK ]
TEST: IPv6: host 0 to host 3, mtu 1400                              [ OK ]
Cannot remove namespace file "/run/netns/h4": No such file or directory
$ echo $?
1

and a non-zero return code, make kselftests fail (even if the test
itself is fine):

...
not ok 34 selftests: net: fib_nexthop_multiprefix.sh # exit=1
...

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 15:06:12 -07:00
Horatiu Vultur
ffb3adba64 net: bridge: Add port attribute IFLA_BRPORT_MRP_IN_OPEN
This patch adds a new port attribute, IFLA_BRPORT_MRP_IN_OPEN, which
allows to notify the userspace when the node lost the contiuity of
MRP_InTest frames.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 13:46:43 -07:00
Jiri Olsa
11bb2f7a45 bpf: Fix cross build for CONFIG_DEBUG_INFO_BTF option
Stephen and 0-DAY CI Kernel Test Service reported broken cross build
for arm (arm-linux-gnueabi-gcc (GCC) 9.3.0), with following output:

   /tmp/ccMS5uth.s: Assembler messages:
   /tmp/ccMS5uth.s:69: Error: unrecognized symbol type ""
   /tmp/ccMS5uth.s:82: Error: unrecognized symbol type ""

Having '@object' for .type  diretive is  wrong because '@' is comment
character for some architectures. Using STT_OBJECT instead that should
work everywhere.

Also using HOST* variables to build resolve_btfids so it's properly
build in crossbuilds (stolen from objtool's Makefile).

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/bpf/20200714102534.299280-2-jolsa@kernel.org
2020-07-14 10:20:31 -07:00
Alexei Starovoitov
f7d40ee7ef selftests/bpf: Fix merge conflict resolution
Remove double definition of structs.

Fixes: 71930d6102 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-07-13 20:59:25 -07:00
David S. Miller
07dd1b7e68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-07-13

The following pull-request contains BPF updates for your *net-next* tree.

We've added 36 non-merge commits during the last 7 day(s) which contain
a total of 62 files changed, 2242 insertions(+), 468 deletions(-).

The main changes are:

1) Avoid trace_printk warning banner by switching bpf_trace_printk to use
   its own tracing event, from Alan.

2) Better libbpf support on older kernels, from Andrii.

3) Additional AF_XDP stats, from Ciara.

4) build time resolution of BTF IDs, from Jiri.

5) BPF_CGROUP_INET_SOCK_RELEASE hook, from Stanislav.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 18:04:05 -07:00
Petr Machata
1add92121e selftests: mlxsw: RED: Test offload of mirror on RED early_drop qevent
Add a selftest for offloading a mirror action attached to the block
associated with RED early_drop qevent.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Andrii Nakryiko
0b20933d8c tools/bpftool: Strip away modifiers from global variables
Reliably remove all the type modifiers from read-only (.rodata) global
variable definitions, including cases of inner field const modifiers and
arrays of const values.

Also modify one of selftests to ensure that const volatile struct doesn't
prevent user-space from modifying .rodata variable.

Fixes: 985ead416d ("bpftool: Add skeleton codegen command")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-3-andriin@fb.com
2020-07-13 17:07:43 -07:00
Andrii Nakryiko
7c819e7013 libbpf: Support stripping modifiers for btf_dump
One important use case when emitting const/volatile/restrict is undesirable is
BPF skeleton generation of DATASEC layout. These are further memory-mapped and
can be written/read from user-space directly.

For important case of .rodata variables, bpftool strips away first-level
modifiers, to make their use on user-space side simple and not requiring extra
type casts to override compiler complaining about writing to const variables.

This logic works mostly fine, but breaks in some more complicated cases. E.g.:

    const volatile int params[10];

Because in BTF it's a chain of ARRAY -> CONST -> VOLATILE -> INT, bpftool
stops at ARRAY and doesn't strip CONST and VOLATILE. In skeleton this variable
will be emitted as is. So when used from user-space, compiler will complain
about writing to const array. This is problematic, as also mentioned in [0].

To solve this for arrays and other non-trivial cases (e.g., inner
const/volatile fields inside the struct), teach btf_dump to strip away any
modifier, when requested. This is done as an extra option on
btf_dump__emit_type_decl() API.

Reported-by: Anton Protopopov <a.s.protopopov@gmail.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200713232409.3062144-2-andriin@fb.com
2020-07-13 17:07:43 -07:00