Commit Graph

19992 Commits

Author SHA1 Message Date
Jason A. Donenfeld
e7096c131e net: WireGuard secure network tunnel
WireGuard is a layer 3 secure networking tunnel made specifically for
the kernel, that aims to be much simpler and easier to audit than IPsec.
Extensive documentation and description of the protocol and
considerations, along with formal proofs of the cryptography, are
available at:

  * https://www.wireguard.com/
  * https://www.wireguard.com/papers/wireguard.pdf

This commit implements WireGuard as a simple network device driver,
accessible in the usual RTNL way used by virtual network drivers. It
makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
networking subsystem APIs. It has a somewhat novel multicore queueing
system designed for maximum throughput and minimal latency of encryption
operations, but it is implemented modestly using workqueues and NAPI.
Configuration is done via generic Netlink, and following a review from
the Netlink maintainer a year ago, several high profile userspace tools
have already implemented the API.

This commit also comes with several different tests, both in-kernel
tests and out-of-kernel tests based on network namespaces, taking profit
of the fact that sockets used by WireGuard intentionally stay in the
namespace the WireGuard interface was originally created, exactly like
the semantics of userspace tun devices. See wireguard.com/netns/ for
pictures and examples.

The source code is fairly short, but rather than combining everything
into a single file, WireGuard is developed as cleanly separable files,
making auditing and comprehension easier. Things are laid out as
follows:

  * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
    cryptographic aspects of the protocol, and are mostly data-only in
    nature, taking in buffers of bytes and spitting out buffers of
    bytes. They also handle reference counting for their various shared
    pieces of data, like keys and key lists.

  * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
    ratelimiting certain types of cryptographic operations in accordance
    with particular WireGuard semantics.

  * allowedips.[ch], peerlookup.[ch]: The main lookup structures of
    WireGuard, the former being trie-like with particular semantics, an
    integral part of the design of the protocol, and the latter just
    being nice helper functions around the various hashtables we use.

  * device.[ch]: Implementation of functions for the netdevice and for
    rtnl, responsible for maintaining the life of a given interface and
    wiring it up to the rest of WireGuard.

  * peer.[ch]: Each interface has a list of peers, with helper functions
    available here for creation, destruction, and reference counting.

  * socket.[ch]: Implementation of functions related to udp_socket and
    the general set of kernel socket APIs, for sending and receiving
    ciphertext UDP packets, and taking care of WireGuard-specific sticky
    socket routing semantics for the automatic roaming.

  * netlink.[ch]: Userspace API entry point for configuring WireGuard
    peers and devices. The API has been implemented by several userspace
    tools and network management utility, and the WireGuard project
    distributes the basic wg(8) tool.

  * queueing.[ch]: Shared function on the rx and tx path for handling
    the various queues used in the multicore algorithms.

  * send.c: Handles encrypting outgoing packets in parallel on
    multiple cores, before sending them in order on a single core, via
    workqueues and ring buffers. Also handles sending handshake and cookie
    messages as part of the protocol, in parallel.

  * receive.c: Handles decrypting incoming packets in parallel on
    multiple cores, before passing them off in order to be ingested via
    the rest of the networking subsystem with GRO via the typical NAPI
    poll function. Also handles receiving handshake and cookie messages
    as part of the protocol, in parallel.

  * timers.[ch]: Uses the timer wheel to implement protocol particular
    event timeouts, and gives a set of very simple event-driven entry
    point functions for callers.

  * main.c, version.h: Initialization and deinitialization of the module.

  * selftest/*.h: Runtime unit tests for some of the most security
    sensitive functions.

  * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
    script using network namespaces.

This commit aims to be as self-contained as possible, implementing
WireGuard as a standalone module not needing much special handling or
coordination from the network subsystem. I expect for future
optimizations to the network stack to positively improve WireGuard, and
vice-versa, but for the time being, this exists as intentionally
standalone.

We introduce a menu option for CONFIG_WIREGUARD, as well as providing a
verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: David Miller <davem@davemloft.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-08 17:48:42 -08:00
Linus Torvalds
95e6ba5133 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) More jumbo frame fixes in r8169, from Heiner Kallweit.

 2) Fix bpf build in minimal configuration, from Alexei Starovoitov.

 3) Use after free in slcan driver, from Jouni Hogander.

 4) Flower classifier port ranges don't work properly in the HW offload
    case, from Yoshiki Komachi.

 5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.

 6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.

 7) Fix flow dissection in dsa TX path, from Alexander Lobakin.

 8) Stale syncookie timestampe fixes from Guillaume Nault.

[ Did an evil merge to silence a warning introduced by this pull - Linus ]

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
  r8169: fix rtl_hw_jumbo_disable for RTL8168evl
  net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
  r8169: add missing RX enabling for WoL on RTL8125
  vhost/vsock: accept only packets with the right dst_cid
  net: phy: dp83867: fix hfs boot in rgmii mode
  net: ethernet: ti: cpsw: fix extra rx interrupt
  inet: protect against too small mtu values.
  gre: refetch erspan header from skb->data after pskb_may_pull()
  pppoe: remove redundant BUG_ON() check in pppoe_pernet
  tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
  tcp: tighten acceptance of ACKs not matching a child socket
  tcp: fix rejected syncookies due to stale timestamps
  lpc_eth: kernel BUG on remove
  tcp: md5: fix potential overestimation of TCP option space
  net: sched: allow indirect blocks to bind to clsact in TC
  net: core: rename indirect block ingress cb function
  net-sysfs: Call dev_hold always in netdev_queue_add_kobject
  net: dsa: fix flow dissection on Tx path
  net/tls: Fix return values to avoid ENOTSUPP
  net: avoid an indirect call in ____sys_recvmsg()
  ...
2019-12-08 13:28:11 -08:00
Valentin Vidic
4a5cdc604b net/tls: Fix return values to avoid ENOTSUPP
ENOTSUPP is not available in userspace, for example:

  setsockopt failed, 524, Unknown error 524

Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-06 20:15:39 -08:00
Yonghong Song
8f9081c925 selftests/bpf: Add a fexit/bpf2bpf test with target bpf prog no callees
The existing fexit_bpf2bpf test covers the target progrm with callees.
This patch added a test for the target program without callees.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191205010607.177904-1-yhs@fb.com
2019-12-04 21:34:42 -08:00
Heiher
f2728fe80c selftests: add epoll selftests
This adds the promised selftest for epoll.  It will verify the wakeups
of epoll.  Including leaf and nested mode, epoll_wait() and poll() and
multi-threads.

Link: http://lkml.kernel.org/r/20191009121518.4027-1-r@hev.cc
Signed-off-by: hev <r@hev.cc>
Reviewed-by: Roman Penyaev <rpenyaev@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-04 19:44:13 -08:00
Stanislav Fomichev
ef8c84effc selftests/bpf: De-flake test_tcpbpf
It looks like BPF program that handles BPF_SOCK_OPS_STATE_CB state
can race with the bpf_map_lookup_elem("global_map"); I sometimes
see the failures in this test and re-running helps.

Since we know that we expect the callback to be called 3 times (one
time for listener socket, two times for both ends of the connection),
let's export this number and add simple retry logic around that.

Also, let's make EXPECT_EQ() not return on failure, but continue
evaluating all conditions; that should make potential debugging
easier.

With this fix in place I don't observe the flakiness anymore.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Lawrence Brakmo <brakmo@fb.com>
Link: https://lore.kernel.org/bpf/20191204190955.170934-1-sdf@google.com
2019-12-04 18:01:05 -08:00
Stanislav Fomichev
6bf6affe18 selftests/bpf: Bring back c++ include/link test
Commit 5c26f9a783 ("libbpf: Don't use cxx to test_libpf target")
converted existing c++ test to c. We still want to include and
link against libbpf from c++ code, so reinstate this test back,
this time in a form of a selftest with a clear comment about
its purpose.

v2:
* -lelf -> $(LDLIBS) (Andrii Nakryiko)

Fixes: 5c26f9a783 ("libbpf: Don't use cxx to test_libpf target")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191202215931.248178-1-sdf@google.com
2019-12-04 17:57:55 -08:00
Stanislav Fomichev
01d434ce98 selftests/bpf: Don't hard-code root cgroup id
Commit 40430452fd ("kernfs: use 64bit inos if ino_t is 64bit") changed
the way cgroup ids are exposed to the userspace. Instead of assuming
fixed root id, let's query it.

Fixes: 40430452fd ("kernfs: use 64bit inos if ino_t is 64bit")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191202200143.250793-1-sdf@google.com
2019-12-04 17:56:22 -08:00
Linus Torvalds
c3bed3b20e Merge tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Warn if a host bridge has no NUMA info (Yunsheng Lin)

   - Add PCI_STD_NUM_BARS for the number of standard BARs (Denis
     Efremov)

  Resource management:

   - Fix boot-time Embedded Controller GPE storm caused by incorrect
     resource assignment after ACPI Bus Check Notification (Mika
     Westerberg)

   - Protect pci_reassign_bridge_resources() against concurrent
     addition/removal (Benjamin Herrenschmidt)

   - Fix bridge dma_ranges resource list cleanup (Rob Herring)

   - Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control
     the MMIO and prefetchable MMIO window sizes of hotplug bridges
     independently (Nicholas Johnson)

   - Fix MMIO/MMIO_PREF window assignment that assigned more space than
     desired (Nicholas Johnson)

   - Only enforce bus numbers from bridge EA if the bridge has EA
     devices downstream (Subbaraya Sundeep)

   - Consolidate DT "dma-ranges" parsing and convert all host drivers to
     use shared parsing (Rob Herring)

  Error reporting:

   - Restore AER capability after resume (Mayurkumar Patel)

   - Add PoisonTLPBlocked AER counter (Rajat Jain)

   - Use for_each_set_bit() to simplify AER code (Andy Shevchenko)

   - Fix AER kernel-doc (Andy Shevchenko)

   - Add "pcie_ports=dpc-native" parameter to allow native use of DPC
     even if platform didn't grant control over AER (Olof Johansson)

  Hotplug:

   - Avoid returning prematurely from sysfs requests to enable or
     disable a PCIe hotplug slot (Lukas Wunner)

   - Don't disable interrupts twice when suspending hotplug ports (Mika
     Westerberg)

   - Fix deadlocks when PCIe ports are hot-removed while suspended (Mika
     Westerberg)

  Power management:

   - Remove unnecessary ASPM locking (Bjorn Helgaas)

   - Add support for disabling L1 PM Substates (Heiner Kallweit)

   - Allow re-enabling Clock PM after it has been disabled (Heiner
     Kallweit)

   - Add sysfs attributes for controlling ASPM link states (Heiner
     Kallweit)

   - Remove CONFIG_PCIEASPM_DEBUG, including "link_state" and "clk_ctl"
     sysfs files (Heiner Kallweit)

   - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on
     USB 2.0 or 1.1 connect events (Kai-Heng Feng)

   - Move power state check out of pci_msi_supported() (Bjorn Helgaas)

   - Fix incorrect MSI-X masking on resume and revert related nvme quirk
     for Kingston NVME SSD running FW E8FK11.T (Jian-Hong Pan)

   - Always return devices to D0 when thawing to fix hibernation with
     drivers like mlx4 that used legacy power management (previously we
     only did it for drivers with new power management ops) (Dexuan Cui)

   - Clear PCIe PME Status even for legacy power management (Bjorn
     Helgaas)

   - Fix PCI PM documentation errors (Bjorn Helgaas)

   - Use dev_printk() for more power management messages (Bjorn Helgaas)

   - Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas)

   - Convert xen-platform from legacy to generic power management (Bjorn
     Helgaas)

   - Removed unused .resume_early() and .suspend_late() legacy power
     management hooks (Bjorn Helgaas)

   - Rearrange power management code for clarity (Rafael J. Wysocki)

   - Decode power states more clearly ("4" or "D4" really refers to
     "D3cold") (Bjorn Helgaas)

   - Notice when reading PM Control register returns an error (~0)
     instead of interpreting it as being in D3hot (Bjorn Helgaas)

   - Add missing link delays required by the PCIe spec (Mika Westerberg)

  Virtualization:

   - Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI (Bjorn
     Helgaas)

   - Allow VFs to use PRI (the PF PRI is shared by the VFs, but the code
     previously didn't recognize that) (Kuppuswamy Sathyanarayanan)

   - Allow VFs to use PASID (the PF PASID capability is shared by the
     VFs, but the code previously didn't recognize that) (Kuppuswamy
     Sathyanarayanan)

   - Disconnect PF and VF ATS enablement, since ATS in PFs and
     associated VFs can be enabled independently (Kuppuswamy
     Sathyanarayanan)

   - Cache PRI and PASID capability offsets (Kuppuswamy Sathyanarayanan)

   - Cache the PRI PRG Response PASID Required bit (Bjorn Helgaas)

   - Consolidate ATS declarations in linux/pci-ats.h (Krzysztof
     Wilczynski)

   - Remove unused PRI and PASID stubs (Bjorn Helgaas)

   - Removed unnecessary EXPORT_SYMBOL_GPL() from ATS, PRI, and PASID
     interfaces that are only used by built-in IOMMU drivers (Bjorn
     Helgaas)

   - Hide PRI and PASID state restoration functions used only inside the
     PCI core (Bjorn Helgaas)

   - Add a DMA alias quirk for the Intel VCA NTB (Slawomir Pawlowski)

   - Serialize sysfs sriov_numvfs reads vs writes (Pierre Crégut)

   - Update Cavium ACS quirk for ThunderX2 and ThunderX3 (George
     Cherian)

   - Fix the UPDCR register address in the Intel ACS quirk (Steffen
     Liebergeld)

   - Unify ACS quirk implementations (Bjorn Helgaas)

  Amlogic Meson host bridge driver:

   - Fix meson PERST# GPIO polarity problem (Remi Pommarel)

   - Add DT bindings for Amlogic Meson G12A (Neil Armstrong)

   - Fix meson clock names to match DT bindings (Neil Armstrong)

   - Add meson support for Amlogic G12A SoC with separate shared PHY
     (Neil Armstrong)

   - Add meson extended PCIe PHY functions for Amlogic G12A USB3+PCIe
     combo PHY (Neil Armstrong)

   - Add arm64 DT for Amlogic G12A PCIe controller node (Neil Armstrong)

   - Add commented-out description of VIM3 USB3/PCIe mux in arm64 DT
     (Neil Armstrong)

  Broadcom iProc host bridge driver:

   - Invalidate iProc PAXB address mapping before programming it
     (Abhishek Shah)

   - Fix iproc-msi and mvebu __iomem annotations (Ben Dooks)

  Cadence host bridge driver:

   - Refactor Cadence PCIe host controller to use as a library for both
     host and endpoint (Tom Joseph)

  Freescale Layerscape host bridge driver:

   - Add layerscape LS1028a support (Xiaowei Bao)

  Intel VMD host bridge driver:

   - Add VMD bus 224-255 restriction decode (Jon Derrick)

   - Add VMD 8086:9A0B device ID (Jon Derrick)

   - Remove Keith from VMD maintainer list (Keith Busch)

  Marvell ARMADA 3700 / Aardvark host bridge driver:

   - Use LTSSM state to build link training flag since Aardvark doesn't
     implement the Link Training bit (Remi Pommarel)

   - Delay before training Aardvark link in case PERST# was asserted
     before the driver probe (Remi Pommarel)

   - Fix Aardvark issues with Root Control reads and writes (Remi
     Pommarel)

   - Don't rely on jiffies in Aardvark config access path since
     interrupts may be disabled (Remi Pommarel)

   - Fix Aardvark big-endian support (Grzegorz Jaszczyk)

  Marvell ARMADA 370 / XP host bridge driver:

   - Make mvebu_pci_bridge_emul_ops static (Ben Dooks)

  Microsoft Hyper-V host bridge driver:

   - Add hibernation support for Hyper-V virtual PCI devices (Dexuan
     Cui)

   - Track Hyper-V pci_protocol_version per-hbus, not globally (Dexuan
     Cui)

   - Avoid kmemleak false positive on hv hbus buffer (Dexuan Cui)

  Mobiveil host bridge driver:

   - Change mobiveil csr_read()/write() function names that conflict
     with riscv arch functions (Kefeng Wang)

  NVIDIA Tegra host bridge driver:

   - Fix Tegra CLKREQ dependency programming (Vidya Sagar)

  Renesas R-Car host bridge driver:

   - Remove unnecessary header include from rcar (Andrew Murray)

   - Tighten register index checking for rcar inbound range programming
     (Marek Vasut)

   - Fix rcar inbound range alignment calculation to improve packing of
     multiple entries (Marek Vasut)

   - Update rcar MACCTLR setting to match documentation (Yoshihiro
     Shimoda)

   - Clear bit 0 of MACCTLR before PCIETCTLR.CFINIT per manual
     (Yoshihiro Shimoda)

   - Add Marek Vasut and Yoshihiro Shimoda as R-Car maintainers (Simon
     Horman)

  Rockchip host bridge driver:

   - Make rockchip 0V9 and 1V8 power regulators non-optional (Robin
     Murphy)

  Socionext UniPhier host bridge driver:

   - Set uniphier to host (RC) mode always (Kunihiko Hayashi)

  Endpoint drivers:

   - Fix endpoint driver sign extension problem when shifting page
     number to phys_addr_t (Alan Mikhak)

  Misc:

   - Add NumaChip SPDX header (Krzysztof Wilczynski)

   - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski)

   - Remove unused includes (Krzysztof Wilczynski)

   - Removed unused sysfs attribute groups (Ben Dooks)

   - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas)

   - Add PCIe Link Control 2 register field definitions to replace magic
     numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas)

   - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and
     Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas)

   - Use pcie_capability_read_word() instead of pci_read_config_word()
     in AMDGPU and Radeon CIK/SI (Frederick Lawler)

   - Remove unused pci_irq_get_node() Greg Kroah-Hartman)

   - Make asm/msi.h mandatory and simplify PCI_MSI_IRQ_DOMAIN Kconfig
     (Palmer Dabbelt, Michal Simek)

   - Read all 64 bits of Switchtec part_event_bitmap (Logan Gunthorpe)

   - Fix erroneous intel-iommu dependency on CONFIG_AMD_IOMMU (Bjorn
     Helgaas)

   - Fix bridge emulation big-endian support (Grzegorz Jaszczyk)

   - Fix dwc find_next_bit() usage (Niklas Cassel)

   - Fix pcitest.c fd leak (Hewenliang)

   - Fix typos and comments (Bjorn Helgaas)

   - Fix Kconfig whitespace errors (Krzysztof Kozlowski)"

* tag 'pci-v5.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (160 commits)
  PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist
  asm-generic: Make msi.h a mandatory include/asm header
  Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"
  PCI/MSI: Fix incorrect MSI-X masking on resume
  PCI/MSI: Move power state check out of pci_msi_supported()
  PCI/MSI: Remove unused pci_irq_get_node()
  PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer
  PCI: hv: Change pci_protocol_version to per-hbus
  PCI: hv: Add hibernation support
  PCI: hv: Reorganize the code in preparation of hibernation
  MAINTAINERS: Remove Keith from VMD maintainer
  PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code
  PCI/ASPM: Add sysfs attributes for controlling ASPM link states
  PCI: Fix indentation
  drm/radeon: Prefer pcie_capability_read_word()
  drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
  drm/radeon: Correct Transmit Margin masks
  drm/amdgpu: Prefer pcie_capability_read_word()
  PCI: uniphier: Set mode register to host mode
  drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
  ...
2019-12-03 13:58:22 -08:00
Linus Torvalds
e30dbe50dc Merge tag 'linux-kselftest-5.5-rc1-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull more kselftest fixes from Shuah Khan:
 "This second Kselftest fixes update for Linux 5.5-rc1 consists of an
  urgent revert to fix regression in CI coverage"

* tag 'linux-kselftest-5.5-rc1-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  Revert "selftests: Fix O= and KBUILD_OUTPUT handling for relative paths"
2019-12-02 17:29:25 -08:00
David S. Miller
734c7022ad Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-12-02

The following pull-request contains BPF updates for your *net* tree.

We've added 10 non-merge commits during the last 6 day(s) which contain
a total of 10 files changed, 60 insertions(+), 51 deletions(-).

The main changes are:

1) Fix vmlinux BTF generation for binutils pre v2.25, from Stanislav Fomichev.

2) Fix libbpf global variable relocation to take symbol's st_value offset
   into account, from Andrii Nakryiko.

3) Fix libbpf build on powerpc where check_abi target fails due to different
   readelf output format, from Aurelien Jarno.

4) Don't set BPF insns RO for the case when they are JITed in order to avoid
   fragmenting the direct map, from Daniel Borkmann.

5) Fix static checker warning in btf_distill_func_proto() as well as a build
   error due to empty enum when BPF is compiled out, from Alexei Starovoitov.

6) Fix up generation of bpf_helper_defs.h for perf, from Arnaldo Carvalho de Melo.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-02 10:50:29 -08:00
Aurelien Jarno
3464afdf11 libbpf: Fix readelf output parsing on powerpc with recent binutils
On powerpc with recent versions of binutils, readelf outputs an extra
field when dumping the symbols of an object file. For example:

    35: 0000000000000838    96 FUNC    LOCAL  DEFAULT [<localentry>: 8]     1 btf_is_struct

The extra "[<localentry>: 8]" prevents the GLOBAL_SYM_COUNT variable to
be computed correctly and causes the check_abi target to fail.

Fix that by looking for the symbol name in the last field instead of the
8th one. This way it should also cope with future extra fields.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/bpf/20191201195728.4161537-1-aurelien@aurel32.net
2019-12-02 10:31:54 +01:00
Linus Torvalds
596cf45cbf Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "Incoming:

   - a small number of updates to scripts/, ocfs2 and fs/buffer.c

   - most of MM

  I still have quite a lot of material (mostly not MM) staged after
  linux-next due to -next dependencies. I'll send those across next week
  as the preprequisites get merged up"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (135 commits)
  mm/page_io.c: annotate refault stalls from swap_readpage
  mm/Kconfig: fix trivial help text punctuation
  mm/Kconfig: fix indentation
  mm/memory_hotplug.c: remove __online_page_set_limits()
  mm: fix typos in comments when calling __SetPageUptodate()
  mm: fix struct member name in function comments
  mm/shmem.c: cast the type of unmap_start to u64
  mm: shmem: use proper gfp flags for shmem_writepage()
  mm/shmem.c: make array 'values' static const, makes object smaller
  userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
  fs/userfaultfd.c: wp: clear VM_UFFD_MISSING or VM_UFFD_WP during userfaultfd_register()
  userfaultfd: wrap the common dst_vma check into an inlined function
  userfaultfd: remove unnecessary WARN_ON() in __mcopy_atomic_hugetlb()
  userfaultfd: use vma_pagesize for all huge page size calculation
  mm/madvise.c: use PAGE_ALIGN[ED] for range checking
  mm/madvise.c: replace with page_size() in madvise_inject_error()
  mm/mmap.c: make vma_merge() comment more easy to understand
  mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops
  autonuma: reduce cache footprint when scanning page tables
  autonuma: fix watermark checking in migrate_balanced_pgdat()
  ...
2019-12-01 20:36:41 -08:00
Linus Torvalds
c3bfc5dd73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix several scatter gather list issues in kTLS code, from Jakub
    Kicinski.

 2) macb driver device remove has to kill the hresp_err_tasklet. From
    Chuhong Yuan.

 3) Several memory leak and reference count bug fixes in tipc, from Tung
    Nguyen.

 4) Fix mlx5 build error w/o ipv6, from Yue Haibing.

 5) Fix jumbo frame and other regressions in r8169, from Heiner
    Kallweit.

 6) Undo some BUG_ON()'s and replace them with WARN_ON_ONCE and proper
    error propagation/handling. From Paolo Abeni.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (24 commits)
  openvswitch: remove another BUG_ON()
  openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()
  net: phy: realtek: fix using paged operations with RTL8105e / RTL8208
  r8169: fix resume on cable plug-in
  r8169: fix jumbo configuration for RTL8168evl
  net: emulex: benet: indent a Kconfig depends continuation line
  selftests: forwarding: fix race between packet receive and tc check
  net: sched: fix `tc -s class show` no bstats on class with nolock subqueues
  net: ethernet: ti: ale: ensure vlan/mdb deleted when no members
  net/mlx5e: Fix build error without IPV6
  selftests: pmtu: use -oneline for ip route list cache
  tipc: fix duplicate SYN messages under link congestion
  tipc: fix wrong timeout input for tipc_wait_for_cond()
  tipc: fix wrong socket reference counter after tipc_sk_timeout() returns
  tipc: fix potential memory leak in __tipc_sendmsg()
  net: macb: add missed tasklet_kill
  selftests: bpf: correct perror strings
  selftests: bpf: test_sockmap: handle file creation failures gracefully
  net/tls: use sg_next() to walk sg entries
  net/tls: remove the dead inplace_crypto code
  ...
2019-12-01 20:35:03 -08:00
Linus Torvalds
e5b3fc125d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Various fixes:

   - Fix the PAT performance regression that downgraded write-combining
     device memory regions to uncached.

   - There's been a number of bugs in 32-bit double fault handling -
     hopefully all fixed now.

   - Fix an LDT crash

   - Fix an FPU over-optimization that broke with GCC9 code
     optimizations.

   - Misc cleanups"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/pat: Fix off-by-one bugs in interval tree search
  x86/ioperm: Save an indentation level in tss_update_io_bitmap()
  x86/fpu: Don't cache access to fpu_fpregs_owner_ctx
  x86/entry/32: Remove unused 'restore_all_notrace' local label
  x86/ptrace: Document FSBASE and GSBASE ABI oddities
  x86/ptrace: Remove set_segment_reg() implementations for current
  x86/traps: die() instead of panicking on a double fault
  x86/doublefault/32: Rewrite the x86_32 #DF handler and unify with 64-bit
  x86/doublefault/32: Move #DF stack and TSS to cpu_entry_area
  x86/doublefault/32: Rename doublefault.c to doublefault_32.c
  x86/traps: Disentangle the 32-bit and 64-bit doublefault code
  lkdtm: Add a DOUBLE_FAULT crash type on x86
  selftests/x86/single_step_syscall: Check SYSENTER directly
  x86/mm/32: Sync only to VMALLOC_END in vmalloc_sync_all()
2019-12-01 19:05:07 -08:00
Linus Torvalds
b7fcf31f70 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:

 - Make /sys/devices/cpu/rdpmc based RDPMC enforcement more
   instantaneous

 - decoder: Update the Intel opcode map

 - Various tooling fixes, including a few late optimizations and
   cleanups.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  perf script: Fix invalid LBR/binary mismatch error
  perf script: Fix brstackinsn for AUXTRACE
  perf affinity: Add infrastructure to save/restore affinity
  perf pmu: Use file system cache to optimize sysfs access
  perf regs: Make perf_reg_name() return "unknown" instead of NULL
  perf diff: Use llabs() with 64-bit values
  perf diff: Use llabs() with 64-bit values
  perf/x86: Implement immediate enforcement of /sys/devices/cpu/rdpmc value of 0
  perf tools: Allow to link with libbpf dynamicaly
  perf tests: Rename tests/map_groups.c to tests/maps.c
  perf tests: Rename thread-mg-share to thread-maps-share
  perf maps: Rename map_groups.h to maps.h
  perf maps: Rename 'mg' variables to 'maps'
  perf map_symbol: Rename ms->mg to ms->maps
  perf addr_location: Rename al->mg to al->maps
  perf thread: Rename thread->mg to thread->maps
  perf maps: Merge 'struct maps' with 'struct map_groups'
  x86/insn: perf tools: Add some more instructions to the new instructions test
  x86/insn: Add some more Intel instructions to the opcode map
  perf map: Remove unused functions
  ...
2019-12-01 18:49:57 -08:00
Linus Torvalds
67b8ed29e0 Merge tag 'platform-drivers-x86-v5.5-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:

 - New bootctl driver for Mellanox BlueField SoC.

 - New driver to support System76 laptops.

 - Temperature monitoring and fan control on Acer Aspire 7551 is now
   supported.

 - Previously the Huawei driver handled only hotkeys. After the
   conversion to WMI it has been expanded to support newer laptop
   models.

 - Big refactoring of intel-speed-select tools allows to use it on Intel
   CascadeLake-N systems.

 - Touchscreen support for ezpad 6 m4 and Schneider SCT101CTM tablets

 - Miscellaneous clean ups and fixes here and there.

* tag 'platform-drivers-x86-v5.5-1' of git://git.infradead.org/linux-platform-drivers-x86: (59 commits)
  platform/x86: hp-wmi: Fix ACPI errors caused by passing 0 as input size
  platform/x86: hp-wmi: Fix ACPI errors caused by too small buffer
  platform/x86: intel_pmc_core: Add Comet Lake (CML) platform support to intel_pmc_core driver
  platform/x86: intel_pmc_core: Fix the SoC naming inconsistency
  platform/mellanox: Fix Kconfig indentation
  tools/power/x86/intel-speed-select: Display TRL buckets for just base config level
  tools/power/x86/intel-speed-select: Ignore missing config level
  platform/x86: touchscreen_dmi: Add info for the ezpad 6 m4 tablet
  tools/power/x86/intel-speed-select: Increment version
  tools/power/x86/intel-speed-select: Use core count for base-freq mask
  tools/power/x86/intel-speed-select: Support platform with limited Intel(R) Speed Select
  tools/power/x86/intel-speed-select: Use Frequency weight for CLOS
  tools/power/x86/intel-speed-select: Make CLOS frequency in MHz
  tools/power/x86/intel-speed-select: Use mailbox for CLOS_PM_QOS_CONFIG
  tools/power/x86/intel-speed-select: Auto mode for CLX
  tools/power/x86/intel-speed-select: Correct CLX-N frequency units
  tools/power/x86/intel-speed-select: Change display of "avx" to "avx2"
  tools/power/x86/intel-speed-select: Extend command set for perf-profile
  Add touchscreen platform data for the Schneider SCT101CTM tablet
  platform/x86: intel_int0002_vgpio: Pass irqchip when adding gpiochip
  ...
2019-12-01 18:24:25 -08:00
Anders Roxell
746dd4012d selftests: vm: add fragment CONFIG_TEST_VMALLOC
When running test_vmalloc.sh smoke the following print out states that
the fragment is missing.

 # ./test_vmalloc.sh: You must have the following enabled in your kernel:
 # CONFIG_TEST_VMALLOC=m

Rework to add the fragment 'CONFIG_TEST_VMALLOC=m' to the config file.

Link: http://lkml.kernel.org/r/20190916095217.19665-1-anders.roxell@linaro.org
Fixes: a05ef00c97 ("selftests/vm: add script helper for CONFIG_TEST_VMALLOC_MODULE")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01 12:59:05 -08:00
Joel Fernandes (Google)
2e53c4e1c8 memfd: add test for COW on MAP_PRIVATE and F_SEAL_FUTURE_WRITE mappings
In this test, the parent and child both have writable private mappings.
The test shows that without the patch in this series, the parent and
child shared the same memory which is incorrect.  In other words, COW
needs to be triggered so any writes to child's copy stays local to the
child.

Link: http://lkml.kernel.org/r/20191107195355.80608-2-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Nicolas Geoffray <ngeoffray@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-12-01 12:59:03 -08:00
Linus Torvalds
b94ae8ad9f Merge tag 'seccomp-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
 "Mostly this is implementing the new flag SECCOMP_USER_NOTIF_FLAG_CONTINUE,
  but there are cleanups as well.

   - implement SECCOMP_USER_NOTIF_FLAG_CONTINUE (Christian Brauner)

   - fixes to selftests (Christian Brauner)

   - remove secure_computing() argument (Christian Brauner)"

* tag 'seccomp-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE
  seccomp: fix SECCOMP_USER_NOTIF_FLAG_CONTINUE test
  seccomp: simplify secure_computing()
  seccomp: test SECCOMP_USER_NOTIF_FLAG_CONTINUE
  seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE
  seccomp: avoid overflow in implicit constant conversion
2019-11-30 17:23:16 -08:00
Linus Torvalds
0dd0c8f7db Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Sasha Levin:

 - support for new VMBus protocols (Andrea Parri)

 - hibernation support (Dexuan Cui)

 - latency testing framework (Branden Bonaby)

 - decoupling Hyper-V page size from guest page size (Himadri Pandya)

* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
  Drivers: hv: vmbus: Fix crash handler reset of Hyper-V synic
  drivers/hv: Replace binary semaphore with mutex
  drivers: iommu: hyperv: Make HYPERV_IOMMU only available on x86
  HID: hyperv: Add the support of hibernation
  hv_balloon: Add the support of hibernation
  x86/hyperv: Implement hv_is_hibernation_supported()
  Drivers: hv: balloon: Remove dependencies on guest page size
  Drivers: hv: vmbus: Remove dependencies on guest page size
  x86: hv: Add function to allocate zeroed page for Hyper-V
  Drivers: hv: util: Specify ring buffer size using Hyper-V page size
  Drivers: hv: Specify receive buffer size using Hyper-V page size
  tools: hv: add vmbus testing tool
  drivers: hv: vmbus: Introduce latency testing
  video: hyperv: hyperv_fb: Support deferred IO for Hyper-V frame buffer driver
  video: hyperv: hyperv_fb: Obtain screen resolution from Hyper-V host
  hv_netvsc: Add the support of hibernation
  hv_sock: Add the support of hibernation
  video: hyperv_fb: Add the support of hibernation
  scsi: storvsc: Add the support of hibernation
  Drivers: hv: vmbus: Add module parameter to cap the VMBus version
  ...
2019-11-30 14:50:51 -08:00
Linus Torvalds
7794b1d418 Merge tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
 "Highlights:

   - Infrastructure for secure boot on some bare metal Power9 machines.
     The firmware support is still in development, so the code here
     won't actually activate secure boot on any existing systems.

   - A change to xmon (our crash handler / pseudo-debugger) to restrict
     it to read-only mode when the kernel is lockdown'ed, otherwise it's
     trivial to drop into xmon and modify kernel data, such as the
     lockdown state.

   - Support for KASLR on 32-bit BookE machines (Freescale / NXP).

   - Fixes for our flush_icache_range() and __kernel_sync_dicache()
     (VDSO) to work with memory ranges >4GB.

   - Some reworks of the pseries CMM (Cooperative Memory Management)
     driver to make it behave more like other balloon drivers and enable
     some cleanups of generic mm code.

   - A series of fixes to our hardware breakpoint support to properly
     handle unaligned watchpoint addresses.

  Plus a bunch of other smaller improvements, fixes and cleanups.

  Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V,
  Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart,
  Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio
  Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana
  Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg
  Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof
  Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues,
  Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
  Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes,
  Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth,
  Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing"

* tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits)
  powerpc/fixmap: fix crash with HIGHMEM
  x86/efi: remove unused variables
  powerpc: Define arch_is_kernel_initmem_freed() for lockdep
  powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp
  powerpc: Avoid clang warnings around setjmp and longjmp
  powerpc: Don't add -mabi= flags when building with Clang
  powerpc: Fix Kconfig indentation
  powerpc/fixmap: don't clear fixmap area in paging_init()
  selftests/powerpc: spectre_v2 test must be built 64-bit
  powerpc/powernv: Disable native PCIe port management
  powerpc/kexec: Move kexec files into a dedicated subdir.
  powerpc/32: Split kexec low level code out of misc_32.S
  powerpc/sysdev: drop simple gpio
  powerpc/83xx: map IMMR with a BAT.
  powerpc/32s: automatically allocate BAT in setbat()
  powerpc/ioremap: warn on early use of ioremap()
  powerpc: Add support for GENERIC_EARLY_IOREMAP
  powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt()
  powerpc/8xx: use the fixmapped IMMR in cpm_reset()
  powerpc/8xx: add __init to cpm1 init functions
  ...
2019-11-30 14:35:43 -08:00
Jiri Pirko
408469d31e selftests: forwarding: fix race between packet receive and tc check
It is possible that tc stats get checked before the packet we check for
actually arrived into the interface and accounted for.
Fix it by checking for the expected result in a loop until
timeout is reached (by default 1 second).

Fixes: 07e5c75184 ("selftests: forwarding: Introduce tc flower matching tests")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-30 12:23:27 -08:00
Thadeu Lima de Souza Cascardo
2745aea675 selftests: pmtu: use -oneline for ip route list cache
Some versions of iproute2 will output more than one line per entry, which
will cause the test to fail, like:

TEST: ipv6: list and flush cached exceptions                        [FAIL]
  can't list cached exceptions

That happens, for example, with iproute2 4.15.0. When using the -oneline
option, this will work just fine:

TEST: ipv6: list and flush cached exceptions                        [ OK ]

This also works just fine with a more recent version of iproute2, like
5.4.0.

For some reason, two lines are printed for the IPv4 test no matter what
version of iproute2 is used. Use the same -oneline parameter there instead
of counting the lines twice.

Fixes: b964641e99 ("selftests: pmtu: Make list_flush_ipv6_exception test more demanding")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-29 12:29:55 -08:00
Jakub Kicinski
e5dc9dd325 selftests: bpf: correct perror strings
perror(str) is basically equivalent to
print("%s: %s\n", str, strerror(errno)).
New line or colon at the end of str is
a mistake/breaks formatting.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28 22:40:30 -08:00
Jakub Kicinski
4b67c51503 selftests: bpf: test_sockmap: handle file creation failures gracefully
test_sockmap creates a temporary file to use for sendpage.
this may fail for various reasons. Handle the error rather
than segfault.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28 22:40:29 -08:00
Jakub Kicinski
65190f7742 selftests/tls: add a test for fragmented messages
Add a sendmsg test with very fragmented messages. This should
fill up sk_msg and test the boundary conditions.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-28 22:40:29 -08:00
Ingo Molnar
e680a41fca Merge tag 'perf-core-for-mingo-5.5-20191128' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

perf script:

  Adrian Hunter:

  - Fix brstackinsn for AUXTRACE.

  - Fix invalid LBR/binary mismatch error.

perf diff:

  Arnaldo Carvalho de Melo:

  - Use llabs() with 64-bit values, fixing the build in some 32-bit
    architectures.

perf pmu:

  Andi Kleen:

  - Use file system cache to optimize sysfs access.

x86:

  Adrian Hunter:

  - Add some more Intel instructions to the opcode map and to the perf
    test entry:

      gf2p8affineinvqb, gf2p8affineqb, gf2p8mulb, v4fmaddps,
      v4fmaddss, v4fnmaddps, v4fnmaddss, vaesdec, vaesdeclast, vaesenc,
      vaesenclast, vcvtne2ps2bf16, vcvtneps2bf16, vdpbf16ps,
      vgf2p8affineinvqb, vgf2p8affineqb, vgf2p8mulb, vp2intersectd,
      vp2intersectq, vp4dpwssd, vp4dpwssds, vpclmulqdq, vpcompressb,
      vpcompressw, vpdpbusd, vpdpbusds, vpdpwssd, vpdpwssds, vpexpandb,
      vpexpandw, vpopcntb, vpopcntd, vpopcntq, vpopcntw, vpshldd, vpshldq,
      vpshldvd, vpshldvq, vpshldvw, vpshldw, vpshrdd, vpshrdq, vpshrdvd,
      vpshrdvq, vpshrdvw, vpshrdw, vpshufbitqmb.

perf affinity:

  Andi Kleen:

  - Add infrastructure to save/restore affinity

perf maps:

  Arnaldo Carvalho de Melo:

  - Merge 'struct maps' with 'struct map_groups', as there is a
    1x1 relationship, simplifying code overal.

perf build:

  Jiri Olsa:

  - Allow to link with libbpf dynamicaly.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-29 06:56:05 +01:00
Shuah Khan
f60b85e836 Revert "selftests: Fix O= and KBUILD_OUTPUT handling for relative paths"
This reverts commit 303e6218ec.

This patch breaks several CI use-cases that run kselftest builds
without using main Makefile. This fix depends on abs_objtree which
is undefined when kselftest build is invoked on selftests Makefile
without going through the main Makefile.

Revert this for now as this patch impacts selftest runs.

Fixes: 303e6218ec ("selftests: Fix O= and KBUILD_OUTPUT handling for relative paths")
Reported-by: Cristian Marussi <cristian.marussi@arm.com>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2019-11-28 16:27:44 -07:00
Bjorn Helgaas
48617f03c9 Merge branch 'remotes/lorenzo/pci/misc'
- Fix iproc-msi and mvebu __iomem annotations (Ben Dooks)

  - Make mvebu_pci_bridge_emul_ops static (Ben Dooks)

  - Add Marek Vasut and Yoshihiro Shimoda as R-Car maintainers (Simon
    Horman)

  - Fix pcitest.c fd leak (Hewenliang)

* remotes/lorenzo/pci/misc:
  tools: PCI: Fix fd leakage
  MAINTAINERS: Add Marek and Shimoda-san as R-Car PCIE co-maintainers
  PCI: mvebu: mvebu_pcie_map_registers __iomem fix
  PCI: mvebu: Make mvebu_pci_bridge_emul_ops static
  PCI: iproc-msi: Fix __iomem annotation in decode_msi_hwirq()
2019-11-28 08:54:54 -06:00
Adrian Hunter
5172672da0 perf script: Fix invalid LBR/binary mismatch error
The 'len' returned by grab_bb() includes an extra MAXINSN bytes to allow
for the last instruction, so the the final 'offs' will not be 'len'.
Fix the error condition logic accordingly.

Before:

  $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
  [ perf record: Woken up 19 times to write data ]
  [ perf record: Captured and wrote 2.274 MB perf.data ]
  $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
            grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
        bmexec+2485:
        00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
        00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
        00005641d5806bd6                        add %rdi, %rax
        00005641d5806bd9                        movzxb  -0x1(%rax), %edx
        00005641d5806bdd                        cmp %rax, %r14
        00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
        mismatch of LBR data and executable
        00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi

After:

  $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
            grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
        bmexec+2485:
        00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
        00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
        00005641d5806bd6                        add %rdi, %rax
        00005641d5806bd9                        movzxb  -0x1(%rax), %edx
        00005641d5806bdd                        cmp %rax, %r14
        00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
        00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi
        00005641d58069c6                        add %rax, %rdi

Fixes: e98df280bc ("perf script brstackinsn: Fix recovery from LBR/binary mismatch")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191127095631.15663-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Adrian Hunter
0cd032d3b5 perf script: Fix brstackinsn for AUXTRACE
brstackinsn must be allowed to be set by the user when AUX area data has
been captured because, in that case, the branch stack might be
synthesized on the fly. This fixes the following error:

Before:

  $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
  [ perf record: Woken up 19 times to write data ]
  [ perf record: Captured and wrote 2.274 MB perf.data ]
  $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
  Display of branch stack assembler requested, but non all-branch filter set
  Hint: run 'perf record -b ...'

After:

  $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
  [ perf record: Woken up 19 times to write data ]
  [ perf record: Captured and wrote 2.274 MB perf.data ]
  $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
            grep 13759 [002]  8091.310257:       1862                                        instructions:uH:      5641d58069eb bmexec+0x86b (/bin/grep)
        bmexec+2485:
        00005641d5806b35                        jnz 0x5641d5806bd0              # MISPRED
        00005641d5806bd0                        movzxb  (%r13,%rdx,1), %eax
        00005641d5806bd6                        add %rdi, %rax
        00005641d5806bd9                        movzxb  -0x1(%rax), %edx
        00005641d5806bdd                        cmp %rax, %r14
        00005641d5806be0                        jnb 0x5641d58069c0              # MISPRED
        mismatch of LBR data and executable
        00005641d58069c0                        movzxb  (%r13,%rdx,1), %edi

Fixes: 48d02a1d5c ("perf script: Add 'brstackinsn' for branch stacks")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Andi Kleen
267ed5d859 perf affinity: Add infrastructure to save/restore affinity
The kernel perf subsystem has to IPI to the target CPU for many
operations. On systems with many CPUs and when managing many events the
overhead can be dominated by lots of IPIs.

An alternative is to set up CPU affinity in the perf tool, then set up
all the events for that CPU, and then move on to the next CPU.

Add some affinity management infrastructure to enable such a model.
Used in followon patches.

Committer notes:

Use zfree() in some places, add missing stdbool.h header, some minor
coding style changes.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191121001522.180827-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Andi Kleen
d96645821e perf pmu: Use file system cache to optimize sysfs access
pmu.c does a lot of redundant /sys accesses while parsing aliases
and probing for PMUs. On large systems with a lot of PMUs this
can get expensive (>2s):

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   27.25    1.227847           8    160888     16976 openat
   26.42    1.190481           7    164224    164077 stat

Add a cache to remember if specific file names exist or don't
exist, which eliminates most of this overhead.

Also optimize some stat() calls to be slightly cheaper access()

Resulting in:

    0.18    0.004166           2      1851       305 open
    0.08    0.001970           2       829       622 access

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191121001522.180827-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Arnaldo Carvalho de Melo
5b596e0ff0 perf regs: Make perf_reg_name() return "unknown" instead of NULL
To avoid breaking the build on arches where this is not wired up, at
least all the other features should be made available and when using
this specific routine, the "unknown" should point the user/developer to
the need to wire this up on this particular hardware architecture.

Detected in a container mipsel debian cross build environment, where it
shows up as:

  In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
                   from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
                   from util/session.c:13:
  In function 'printf',
      inlined from 'regs_dump__printf' at util/session.c:1103:3,
      inlined from 'regs__printf' at util/session.c:1131:2:
  /usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

cross compiler details:

  mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909

Also on mips64:

  In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
                   from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
                   from util/session.c:13:
  In function 'printf',
      inlined from 'regs_dump__printf' at util/session.c:1103:3,
      inlined from 'regs__printf' at util/session.c:1131:2,
      inlined from 'regs_user__printf' at util/session.c:1139:3,
      inlined from 'dump_sample' at util/session.c:1246:3,
      inlined from 'machines__deliver_event' at util/session.c:1421:3:
  /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In function 'printf',
      inlined from 'regs_dump__printf' at util/session.c:1103:3,
      inlined from 'regs__printf' at util/session.c:1131:2,
      inlined from 'regs_intr__printf' at util/session.c:1147:3,
      inlined from 'dump_sample' at util/session.c:1249:3,
      inlined from 'machines__deliver_event' at util/session.c:1421:3:
  /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 |   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

cross compiler details:

  mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909

Fixes: 2bcd355b71 ("perf tools: Add interface to arch registers sets")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Arnaldo Carvalho de Melo
2b1ac6403f perf diff: Use llabs() with 64-bit values
To fix this build error on a debian mipsel cross build environment:

  builtin-diff.c: In function 'compute_cycles_diff':
  builtin-diff.c:649:10: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
    649 |    val = labs(pair->block_info->cycles_spark[i] -
        |          ^~~~

Fixes: cebf7d51a6 ("perf diff: Report noisy for cycles diff")
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Arnaldo Carvalho de Melo
98e9324511 perf diff: Use llabs() with 64-bit values
To fix these build errors on a debian mipsel cross build environment:

  builtin-diff.c: In function 'block_cycles_diff_cmp':
  builtin-diff.c:550:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
    550 |  l = labs(left->diff.cycles);
        |      ^~~~
  builtin-diff.c:551:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
    551 |  r = labs(right->diff.cycles);
        |      ^~~~

Fixes: 99150a1faa ("perf diff: Use hists to manage basic blocks per symbol")
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:37 -03:00
Alexei Starovoitov
7c3977d1e8 libbpf: Fix sym->st_value print on 32-bit arches
The st_value field is a 64-bit value and causing this error on 32-bit arches:

In file included from libbpf.c:52:
libbpf.c: In function 'bpf_program__record_reloc':
libbpf_internal.h:59:22: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'Elf64_Addr' {aka 'const long long unsigned int'} [-Werror=format=]

Fix it with (__u64) cast.

Fixes: 1f8e2bcb2c ("libbpf: Refactor relocation handling")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-11-27 17:46:56 -08:00
Arnaldo Carvalho de Melo
1fd450f992 libbpf: Fix up generation of bpf_helper_defs.h
$ make -C tools/perf build-test

does, ends up with these two problems:

  make[3]: *** No rule to make target '/tmp/tmp.zq13cHILGB/perf-5.3.0/include/uapi/linux/bpf.h', needed by 'bpf_helper_defs.h'.  Stop.
  make[3]: *** Waiting for unfinished jobs....
  make[2]: *** [Makefile.perf:757: /tmp/tmp.zq13cHILGB/perf-5.3.0/tools/lib/bpf/libbpf.a] Error 2
  make[2]: *** Waiting for unfinished jobs....

Because $(srcdir) points to the /tmp/tmp.zq13cHILGB/perf-5.3.0 directory
and we need '/tools/ after that variable, and after fixing this then we
get to another problem:

  /bin/sh: /home/acme/git/perf/tools/scripts/bpf_helpers_doc.py: No such file or directory
  make[3]: *** [Makefile:184: bpf_helper_defs.h] Error 127
  make[3]: *** Deleting file 'bpf_helper_defs.h'
    LD       /tmp/build/perf/libapi-in.o
  make[2]: *** [Makefile.perf:778: /tmp/build/perf/libbpf.a] Error 2
  make[2]: *** Waiting for unfinished jobs....

Because this requires something outside the tools/ directories that gets
collected into perf's detached tarballs, to fix it just add it to
tools/perf/MANIFEST, which this patch does, now it works for that case
and also for all these other cases.

Fixes: e01a75c159 ("libbpf: Move bpf_{helpers, helper_defs, endian, tracing}.h into libbpf")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-4pnkg2vmdvq5u6eivc887wen@git.kernel.org
Link: https://lore.kernel.org/bpf/20191126151045.GB19483@kernel.org
2019-11-27 16:52:31 -08:00
Andrii Nakryiko
53f8dd434b libbpf: Fix global variable relocation
Similarly to a0d7da26ce ("libbpf: Fix call relocation offset calculation
bug"), relocations against global variables need to take into account
referenced symbol's st_value, which holds offset into a corresponding data
section (and, subsequently, offset into internal backing map). For static
variables this offset is always zero and data offset is completely described
by respective instruction's imm field.

Convert a bunch of selftests to global variables. Previously they were relying
on `static volatile` trick to ensure Clang doesn't inline static variables,
which with global variables is not necessary anymore.

Fixes: 393cdfbee8 ("libbpf: Support initialized global variables")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200651.1381348-1-andriin@fb.com
2019-11-27 16:34:21 -08:00
Andrii Nakryiko
b568405856 libbpf: Fix Makefile' libbpf symbol mismatch diagnostic
Fix Makefile's diagnostic diff output when there is LIBBPF_API-versioned
symbols mismatch.

Fixes: 1bd6352459 ("libbpf: handle symbol versioning properly for libbpf.a")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191127200134.1360660-1-andriin@fb.com
2019-11-27 22:29:02 +01:00
Linus Torvalds
95f1fa9e34 Merge tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
 "New tracing features:

   - New PERMANENT flag to ftrace_ops when attaching a callback to a
     function.

     As /proc/sys/kernel/ftrace_enabled when set to zero will disable
     all attached callbacks in ftrace, this has a detrimental impact on
     live kernel tracing, as it disables all that it patched. If a
     ftrace_ops is registered to ftrace with the PERMANENT flag set, it
     will prevent ftrace_enabled from being disabled, and if
     ftrace_enabled is already disabled, it will prevent a ftrace_ops
     with PREMANENT flag set from being registered.

   - New register_ftrace_direct().

     As eBPF would like to register its own trampolines to be called by
     the ftrace nop locations directly, without going through the ftrace
     trampoline, this function has been added. This allows for eBPF
     trampolines to live along side of ftrace, perf, kprobe and live
     patching. It also utilizes the ftrace enabled_functions file that
     keeps track of functions that have been modified in the kernel, to
     allow for security auditing.

   - Allow for kernel internal use of ftrace instances.

     Subsystems in the kernel can now create and destroy their own
     tracing instances which allows them to have their own tracing
     buffer, and be able to record events without worrying about other
     users from writing over their data.

   - New seq_buf_hex_dump() that lets users use the hex_dump() in their
     seq_buf usage.

   - Notifications now added to tracing_max_latency to allow user space
     to know when a new max latency is hit by one of the latency
     tracers.

   - Wider spread use of generic compare operations for use of bsearch
     and friends.

   - More synthetic event fields may be defined (32 up from 16)

   - Use of xarray for architectures with sparse system calls, for the
     system call trace events.

  This along with small clean ups and fixes"

* tag 'trace-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (51 commits)
  tracing: Enable syscall optimization for MIPS
  tracing: Use xarray for syscall trace events
  tracing: Sample module to demonstrate kernel access to Ftrace instances.
  tracing: Adding new functions for kernel access to Ftrace instances
  tracing: Fix Kconfig indentation
  ring-buffer: Fix typos in function ring_buffer_producer
  ftrace: Use BIT() macro
  ftrace: Return ENOTSUPP when DYNAMIC_FTRACE_WITH_DIRECT_CALLS is not configured
  ftrace: Rename ftrace_graph_stub to ftrace_stub_graph
  ftrace: Add a helper function to modify_ftrace_direct() to allow arch optimization
  ftrace: Add helper find_direct_entry() to consolidate code
  ftrace: Add another check for match in register_ftrace_direct()
  ftrace: Fix accounting bug with direct->count in register_ftrace_direct()
  ftrace/selftests: Fix spelling mistake "wakeing" -> "waking"
  tracing: Increase SYNTH_FIELDS_MAX for synthetic_events
  ftrace/samples: Add a sample module that implements modify_ftrace_direct()
  ftrace: Add modify_ftrace_direct()
  tracing: Add missing "inline" in stub function of latency_fsnotify()
  tracing: Remove stray tab in TRACE_EVAL_MAP_FILE's help text
  tracing: Use seq_buf_hex_dump() to dump buffers
  ...
2019-11-27 11:42:01 -08:00
Linus Torvalds
6a0e20cd8c Merge tag 'riscv/for-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Paul Walmsley:
 "New features:
   - SECCOMP support
   - nommu support
   - SBI-less system support
   - M-Mode support
   - TLB flush optimizations

  Other improvements:
   - Pass the complete RISC-V ISA string supported by the CPU cores to
     userspace, rather than redacting parts of it in the kernel
   - Add platform DMA IP block data to the HiFive Unleashed board DT
     file
   - Add Makefile support for BZ2, LZ4, LZMA, LZO kernel image
     compression formats, in line with other architectures

  Cleanups:
   - Remove unnecessary PTE_PARENT_SIZE macro
   - Standardize include guard naming across arch/riscv"

* tag 'riscv/for-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (22 commits)
  riscv: provide a flat image loader
  riscv: add nommu support
  riscv: clear the instruction cache and all registers when booting
  riscv: read the hart ID from mhartid on boot
  riscv: provide native clint access for M-mode
  riscv: dts: add support for PDMA device of HiFive Unleashed Rev A00
  riscv: add support for MMIO access to the timer registers
  riscv: implement remote sfence.i using IPIs
  riscv: cleanup the default power off implementation
  riscv: poison SBI calls for M-mode
  riscv: don't allow selecting SBI based drivers for M-mode
  RISC-V: Add multiple compression image format.
  riscv: clean up the macro format in each header file
  riscv: Use PMD_SIZE to replace PTE_PARENT_SIZE
  riscv: abstract out CSR names for supervisor vs machine mode
  riscv: separate MMIO functions into their own header file
  riscv: enter WFI in default_power_off() if SBI does not shutdown
  RISC-V: Issue a tlb page flush if possible
  RISC-V: Issue a local tlbflush if possible.
  RISC-V: Do not invoke SBI call if cpumask is empty
  ...
2019-11-27 11:27:59 -08:00
Linus Torvalds
0dd09bc02c Merge tag 'staging-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging / iio updates from Greg KH:
 "Here is the big staging and iio set of patches for the 5.5-rc1
  release.

  It's the usual huge collection of cleanup patches all over the
  drivers/staging/ area, along with a new staging driver, and a bunch of
  new IIO drivers as well.

  Full details are in the shortlog, but all of these have been in
  linux-next for a long time with no reported issues"

* tag 'staging-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (548 commits)
  staging: vchiq: Have vchiq_dump_* functions return an error code
  staging: vchiq: Refactor indentation in vchiq_dump_* functions
  staging: fwserial: Fix Kconfig indentation (seven spaces)
  staging: vchiq_dump: Replace min with min_t
  staging: vchiq: Fix block comment format in vchiq_dump()
  staging: octeon: indent with tabs instead of spaces
  staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error
  staging: most: core: remove sysfs attr remove_link
  staging: vc04: Fix Kconfig indentation
  staging: pi433: Fix Kconfig indentation
  staging: nvec: Fix Kconfig indentation
  staging: most: Fix Kconfig indentation
  staging: fwserial: Fix Kconfig indentation
  staging: fbtft: Fix Kconfig indentation
  fbtft: Drop OF dependency
  fbtft: Make use of device property API
  fbtft: Drop useless #ifdef CONFIG_OF and dead code
  fbtft: Describe function parameters in kernel-doc
  fbtft: Make sure string is NULL terminated
  staging: rtl8723bs: remove set but not used variable 'change', 'pos'
  ...
2019-11-27 10:57:52 -08:00
Linus Torvalds
59274c7164 Merge tag 'usb-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB updates from Greg KH:
 "Here is the big set of USB patches for 5.5-rc1

  Lots of little things in here:
   - typec updates and additions
   - usb-serial drivers cleanups and fixes
   - misc USB drivers cleanups and fixes
   - gadget drivers new features and controllers added
   - usual xhci additions
   - other minor changes

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (223 commits)
  usb: gadget: udc: gr_udc: create debugfs directory under usb root
  usb: gadget: atmel: create debugfs directory under usb root
  usb: musb: create debugfs directory under usb root
  usb: serial: Fix Kconfig indentation
  usb: misc: Fix Kconfig indentation
  usb: gadget: Fix Kconfig indentation
  usb: host: Fix Kconfig indentation
  usb: dwc3: Fix Kconfig indentation
  usb: gadget: configfs: Fix missing spin_lock_init()
  usb-storage: Disable UAS on JMicron SATA enclosure
  USB: documentation: flags on usb-storage versus UAS
  USB: uas: heed CAPACITY_HEURISTICS
  USB: uas: honor flag to avoid CAPACITY16
  usb: host: xhci-tegra: Correct phy enable sequence
  usb-serial: cp201x: support Mark-10 digital force gauge
  usb: chipidea: imx: pinctrl for HSIC is optional
  usb: chipidea: imx: refine the error handling for hsic
  usb: chipidea: imx: change hsic power regulator as optional
  usb: chipidea: imx: check data->usbmisc_data against NULL before access
  usb: chipidea: core: change vbus-regulator as optional
  ...
2019-11-27 10:46:34 -08:00
Linus Torvalds
9e7a03233e Merge tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
 "These include cpuidle changes to use nanoseconds (instead of
  microseconds) as the unit of time and to simplify checks for disabled
  idle states in the idle loop, some cpuidle fixes and governor updates,
  assorted cpufreq updates (driver updates mostly and a few core fixes
  and cleanups), devfreq updates (dominated by the tegra30 driver
  changes), new CPU IDs for the RAPL power capping driver, relatively
  minor updates of the generic power domains (genpd) and operation
  performance points (OPP) frameworks, and assorted fixes and cleanups.

  There are also two maintainer information updates: Chanwoo Choi will
  be maintaining the devfreq subsystem going forward and Todd Brandt is
  going to maintain the pm-graph utility (created by him).

  Specifics:

   - Use nanoseconds (instead of microseconds) as the unit of time in
     the cpuidle core and simplify checks for disabled idle states in
     the idle loop (Rafael Wysocki)

   - Fix and clean up the teo cpuidle governor (Rafael Wysocki)

   - Fix the cpuidle registration error code path (Zhenzhong Duan)

   - Avoid excessive vmexits in the ACPI cpuidle driver (Yin Fengwei)

   - Extend the idle injection infrastructure to be able to measure the
     requested duration in nanoseconds and to allow an exit latency
     limit for idle states to be specified (Daniel Lezcano)

   - Fix cpufreq driver registration and clarify a comment in the
     cpufreq core (Viresh Kumar)

   - Add NULL checks to the show() and store() methods of sysfs
     attributes exposed by cpufreq (Kai Shen)

   - Update cpufreq drivers:
      * Fix for a plain int as pointer warning from sparse in
        intel_pstate (Jamal Shareef)
      * Fix for a hardcoded number of CPUs and stack bloat in the
        powernv driver (John Hubbard)
      * Updates to the ti-cpufreq driver and DT files to support new
        platforms and migrate bindings from opp-v1 to opp-v2 (Adam Ford,
        H. Nikolaus Schaller)
      * Merging of the arm_big_little and vexpress-spc drivers and
        related cleanup (Sudeep Holla)
      * Fix for imx's default speed grade value (Anson Huang)
      * Minor cleanup of the s3c64xx driver (Nathan Chancellor)
      * CPU speed bin detection fix for sun50i (Ondrej Jirman)

   - Appoint Chanwoo Choi as the new devfreq maintainer.

   - Update the devfreq core:
      * Check NULL governor in available_governors_show sysfs to prevent
        showing wrong governor information and fix a race condition
        between devfreq_update_status() and trans_stat_show() (Leonard
        Crestez)
      * Add new 'interrupt-driven' flag for devfreq governors to allow
        interrupt-driven governors to prevent the devfreq core from
        polling devices for status (Dmitry Osipenko)
      * Improve an error message in devfreq_add_device() (Matthias
        Kaehlcke)

   - Update devfreq drivers:
      * tegra30 driver fixes and cleanups (Dmitry Osipenko)
      * Removal of unused property from dt-binding documentation for the
        exynos-bus driver (Kamil Konieczny)
      * exynos-ppmu cleanup and DT bindings update (Lukasz Luba, Marek
        Szyprowski)

   - Add new CPU IDs for CometLake Mobile and Desktop to the Intel RAPL
     power capping driver (Zhang Rui)

   - Allow device initialization in the generic power domains (genpd)
     framework to be more straightforward and clean it up (Ulf Hansson)

   - Add support for adjusting OPP voltages at run time to the OPP
     framework (Stephen Boyd)

   - Avoid freeing memory that has never been allocated in the
     hibernation core (Andy Whitcroft)

   - Clean up function headers in a header file and coding style in the
     wakeup IRQs handling code (Ulf Hansson, Xiaofei Tan)

   - Clean up the SmartReflex adaptive voltage scaling (AVS) driver for
     ARM (Ben Dooks, Geert Uytterhoeven)

   - Wrap power management documentation to fit in 80 columns (Bjorn
     Helgaas)

   - Add pm-graph utility entry to MAINTAINERS (Todd Brandt)

   - Update the cpupower utility:
      * Fix the handling of set and info subcommands (Abhishek Goel)
      * Fix build warnings (Nathan Chancellor)
      * Improve mperf_monitor handling (Janakarajan Natarajan)"

* tag 'pm-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (83 commits)
  PM: Wrap documentation to fit in 80 columns
  cpuidle: Pass exit latency limit to cpuidle_use_deepest_state()
  cpuidle: Allow idle injection to apply exit latency limit
  cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks
  cpuidle: teo: Avoid code duplication in conditionals
  cpufreq: Register drivers only after CPU devices have been registered
  cpuidle: teo: Avoid using "early hits" incorrectly
  cpuidle: teo: Exclude cpuidle overhead from computations
  PM / Domains: Convert to dev_to_genpd_safe() in genpd_syscore_switch()
  mmc: tmio: Avoid boilerplate code in ->runtime_suspend()
  PM / Domains: Implement the ->start() callback for genpd
  PM / Domains: Introduce dev_pm_domain_start()
  ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition
  PM / wakeirq: remove unnecessary parentheses
  power: avs: smartreflex: Remove superfluous cast in debugfs_create_file() call
  cpuidle: Use nanoseconds as the unit of time
  PM / OPP: Support adjusting OPP voltages at runtime
  PM / core: Clean up some function headers in power.h
  cpufreq: Add NULL checks to show() and store() methods of cpufreq
  cpufreq: intel_pstate: Fix plain int as pointer warning from sparse
  ...
2019-11-26 19:06:44 -08:00
Linus Torvalds
168829ad09 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes in this cycle were:

   - A comprehensive rewrite of the robust/PI futex code's exit handling
     to fix various exit races. (Thomas Gleixner et al)

   - Rework the generic REFCOUNT_FULL implementation using
     atomic_fetch_* operations so that the performance impact of the
     cmpxchg() loops is mitigated for common refcount operations.

     With these performance improvements the generic implementation of
     refcount_t should be good enough for everybody - and this got
     confirmed by performance testing, so remove ARCH_HAS_REFCOUNT and
     REFCOUNT_FULL entirely, leaving the generic implementation enabled
     unconditionally. (Will Deacon)

   - Other misc changes, fixes, cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  lkdtm: Remove references to CONFIG_REFCOUNT_FULL
  locking/refcount: Remove unused 'refcount_error_report()' function
  locking/refcount: Consolidate implementations of refcount_t
  locking/refcount: Consolidate REFCOUNT_{MAX,SATURATED} definitions
  locking/refcount: Move saturation warnings out of line
  locking/refcount: Improve performance of generic REFCOUNT_FULL code
  locking/refcount: Move the bulk of the REFCOUNT_FULL implementation into the <linux/refcount.h> header
  locking/refcount: Remove unused refcount_*_checked() variants
  locking/refcount: Ensure integer operands are treated as signed
  locking/refcount: Define constants for saturation and max refcount values
  futex: Prevent exit livelock
  futex: Provide distinct return value when owner is exiting
  futex: Add mutex around futex exit
  futex: Provide state handling for exec() as well
  futex: Sanitize exit state handling
  futex: Mark the begin of futex exit explicitly
  futex: Set task::futex_state to DEAD right after handling futex exit
  futex: Split futex_mm_release() for exit/exec
  exit/exec: Seperate mm_release()
  futex: Replace PF_EXITPIDONE with a state
  ...
2019-11-26 16:02:40 -08:00
Linus Torvalds
1ae78780ed Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Dynamic tick (nohz) updates, perhaps most notably changes to force
     the tick on when needed due to lengthy in-kernel execution on CPUs
     on which RCU is waiting.

   - Linux-kernel memory consistency model updates.

   - Replace rcu_swap_protected() with rcu_prepace_pointer().

   - Torture-test updates.

   - Documentation updates.

   - Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
  net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
  net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
  net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
  bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
  fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
  drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
  drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
  x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
  rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
  rcu: Suppress levelspread uninitialized messages
  rcu: Fix uninitialized variable in nocb_gp_wait()
  rcu: Update descriptions for rcu_future_grace_period tracepoint
  rcu: Update descriptions for rcu_nocb_wake tracepoint
  rcu: Remove obsolete descriptions for rcu_barrier tracepoint
  rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
  workqueue: Convert for_each_wq to use built-in list check
  rcu: Several rcu_segcblist functions can be static
  rcu: Remove unused function hlist_bl_del_init_rcu()
  Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
  ...
2019-11-26 15:42:43 -08:00
Linus Torvalds
3f59dbcace Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main kernel side changes in this cycle were:

   - Various Intel-PT updates and optimizations (Alexander Shishkin)

   - Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)

   - Add support for LSM and SELinux checks to control access to the
     perf syscall (Joel Fernandes)

   - Misc other changes, optimizations, fixes and cleanups - see the
     shortlog for details.

  There were numerous tooling changes as well - 254 non-merge commits.
  Here are the main changes - too many to list in detail:

   - Enhancements to core tooling infrastructure, perf.data, libperf,
     libtraceevent, event parsing, vendor events, Intel PT, callchains,
     BPF support and instruction decoding.

   - There were updates to the following tools:

        perf annotate
        perf diff
        perf inject
        perf kvm
        perf list
        perf maps
        perf parse
        perf probe
        perf record
        perf report
        perf script
        perf stat
        perf test
        perf trace

   - And a lot of other changes: please see the shortlog and Git log for
     more details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
  perf parse: Fix potential memory leak when handling tracepoint errors
  perf probe: Fix spelling mistake "addrees" -> "address"
  libtraceevent: Fix memory leakage in copy_filter_type
  libtraceevent: Fix header installation
  perf intel-bts: Does not support AUX area sampling
  perf intel-pt: Add support for decoding AUX area samples
  perf intel-pt: Add support for recording AUX area samples
  perf pmu: When using default config, record which bits of config were changed by the user
  perf auxtrace: Add support for queuing AUX area samples
  perf session: Add facility to peek at all events
  perf auxtrace: Add support for dumping AUX area samples
  perf inject: Cut AUX area samples
  perf record: Add aux-sample-size config term
  perf record: Add support for AUX area sampling
  perf auxtrace: Add support for AUX area sample recording
  perf auxtrace: Move perf_evsel__find_pmu()
  perf record: Add a function to test for kernel support for AUX area sampling
  perf tools: Add kernel AUX area sampling definitions
  perf/core: Make the mlock accounting simple again
  perf report: Jump to symbol source view from total cycles view
  ...
2019-11-26 15:04:47 -08:00
Andy Lutomirski
3300c4f3af selftests/x86/single_step_syscall: Check SYSENTER directly
We used to test SYSENTER only through the vDSO.  Test it directly
too, just in case.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-11-26 21:53:34 +01:00