Commit Graph

826113 Commits

Author SHA1 Message Date
Vasily Averin
a10674bf24 tcp: detecting the misuse of .sendpage for Slab objects
sendpage was not designed for processing of the Slab pages,
in some situations it can trigger BUG_ON on receiving side.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:48:31 -08:00
Michael J. Ruhl
bc5add0976 IB/hfi1: Close race condition on user context disable and close
When disabling and removing a receive context, it is possible for an
asynchronous event (i.e IRQ) to occur.  Because of this, there is a race
between cleaning up the context, and the context being used by the
asynchronous event.

cpu 0  (context cleanup)
    rc->ref_count-- (ref_count == 0)
    hfi1_rcd_free()
cpu 1  (IRQ (with rcd index))
	rcd_get_by_index()
	lock
	ref_count+++     <-- reference count race (WARNING)
	return rcd
	unlock
cpu 0
    hfi1_free_ctxtdata() <-- incorrect free location
    lock
    remove rcd from array
    unlock
    free rcd

This race will cause the following WARNING trace:

WARNING: CPU: 0 PID: 175027 at include/linux/kref.h:52 hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
CPU: 0 PID: 175027 Comm: IMB-MPI1 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
Call Trace:
  dump_stack+0x19/0x1b
  __warn+0xd8/0x100
  warn_slowpath_null+0x1d/0x20
  hfi1_rcd_get_by_index+0x84/0xa0 [hfi1]
  is_rcv_urgent_int+0x24/0x90 [hfi1]
  general_interrupt+0x1b6/0x210 [hfi1]
  __handle_irq_event_percpu+0x44/0x1c0
  handle_irq_event_percpu+0x32/0x80
  handle_irq_event+0x3c/0x60
  handle_edge_irq+0x7f/0x150
  handle_irq+0xe4/0x1a0
  do_IRQ+0x4d/0xf0
  common_interrupt+0x162/0x162

The race can also lead to a use after free which could be similar to:

general protection fault: 0000 1 SMP
CPU: 71 PID: 177147 Comm: IMB-MPI1 Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0076.C4.111920150602 11/19/2015
task: ffff9962a8098000 ti: ffff99717a508000 task.ti: ffff99717a508000 __kmalloc+0x94/0x230
Call Trace:
  ? hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_user_sdma_process_request+0x9c8/0x1250 [hfi1]
  hfi1_aio_write+0xba/0x110 [hfi1]
  do_sync_readv_writev+0x7b/0xd0
  do_readv_writev+0xce/0x260
  ? handle_mm_fault+0x39d/0x9b0
  ? pick_next_task_fair+0x5f/0x1b0
  ? sched_clock_cpu+0x85/0xc0
  ? __schedule+0x13a/0x890
  vfs_writev+0x35/0x60
  SyS_writev+0x7f/0x110
  system_call_fastpath+0x22/0x27

Use the appropriate kref API to verify access.

Reorder context cleanup to ensure context removal before cleanup occurs
correctly.

Cc: stable@vger.kernel.org # v4.14.0+
Fixes: f683c80ca6 ("IB/hfi1: Resolve kernel panics by reference counting receive contexts")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-06 14:47:09 -04:00
Arnd Bergmann
7b83762376 appletalk: Add atalk.h header files to MAINTAINERS file
Add the path names here so that git-send-email can pick up the
netdev@vger.kernel.org Cc line automatically for a patch that
only touches the headers.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:46:43 -08:00
Arnd Bergmann
27da0d2ef9 appletalk: Fix compile regression
A bugfix just broke compilation of appletalk when CONFIG_SYSCTL
is disabled:

In file included from net/appletalk/ddp.c:65:
net/appletalk/ddp.c: In function 'atalk_init':
include/linux/atalk.h:164:34: error: expected expression before 'do'
 #define atalk_register_sysctl()  do { } while(0)
                                  ^~
net/appletalk/ddp.c:1934:7: note: in expansion of macro 'atalk_register_sysctl'
  rc = atalk_register_sysctl();

This is easier to avoid by using conventional inline functions
as stubs rather than macros. The header already has inline
functions for other purposes, so I'm changing over all the
macros for consistency.

Fixes: 6377f787ae ("appletalk: Fix use-after-free in atalk_proc_exit")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:46:29 -08:00
Alan Maguire
f4b3ec4e6a iptunnel: NULL pointer deref for ip_md_tunnel_xmit
Naresh Kamboju noted the following oops during execution of selftest
tools/testing/selftests/bpf/test_tunnel.sh on x86_64:

[  274.120445] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000000
[  274.128285] #PF error: [INSTR]
[  274.131351] PGD 8000000414a0e067 P4D 8000000414a0e067 PUD 3b6334067 PMD 0
[  274.138241] Oops: 0010 [#1] SMP PTI
[  274.141734] CPU: 1 PID: 11464 Comm: ping Not tainted
5.0.0-rc4-next-20190129 #1
[  274.149046] Hardware name: Supermicro SYS-5019S-ML/X11SSH-F, BIOS
2.0b 07/27/2017
[  274.156526] RIP: 0010:          (null)
[  274.160280] Code: Bad RIP value.
[  274.163509] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
[  274.168726] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
[  274.175851] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
[  274.182974] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
[  274.190098] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
[  274.197222] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
[  274.204346] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
knlGS:0000000000000000
[  274.212424] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  274.218162] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
[  274.225292] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  274.232416] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  274.239541] Call Trace:
[  274.241988]  ? tnl_update_pmtu+0x296/0x3b0
[  274.246085]  ip_md_tunnel_xmit+0x1bc/0x520
[  274.250176]  gre_fb_xmit+0x330/0x390
[  274.253754]  gre_tap_xmit+0x128/0x180
[  274.257414]  dev_hard_start_xmit+0xb7/0x300
[  274.261598]  sch_direct_xmit+0xf6/0x290
[  274.265430]  __qdisc_run+0x15d/0x5e0
[  274.269007]  __dev_queue_xmit+0x2c5/0xc00
[  274.273011]  ? dev_queue_xmit+0x10/0x20
[  274.276842]  ? eth_header+0x2b/0xc0
[  274.280326]  dev_queue_xmit+0x10/0x20
[  274.283984]  ? dev_queue_xmit+0x10/0x20
[  274.287813]  arp_xmit+0x1a/0xf0
[  274.290952]  arp_send_dst.part.19+0x46/0x60
[  274.295138]  arp_solicit+0x177/0x6b0
[  274.298708]  ? mod_timer+0x18e/0x440
[  274.302281]  neigh_probe+0x57/0x70
[  274.305684]  __neigh_event_send+0x197/0x2d0
[  274.309862]  neigh_resolve_output+0x18c/0x210
[  274.314212]  ip_finish_output2+0x257/0x690
[  274.318304]  ip_finish_output+0x219/0x340
[  274.322314]  ? ip_finish_output+0x219/0x340
[  274.326493]  ip_output+0x76/0x240
[  274.329805]  ? ip_fragment.constprop.53+0x80/0x80
[  274.334510]  ip_local_out+0x3f/0x70
[  274.337992]  ip_send_skb+0x19/0x40
[  274.341391]  ip_push_pending_frames+0x33/0x40
[  274.345740]  raw_sendmsg+0xc15/0x11d0
[  274.349403]  ? __might_fault+0x85/0x90
[  274.353151]  ? _copy_from_user+0x6b/0xa0
[  274.357070]  ? rw_copy_check_uvector+0x54/0x130
[  274.361604]  inet_sendmsg+0x42/0x1c0
[  274.365179]  ? inet_sendmsg+0x42/0x1c0
[  274.368937]  sock_sendmsg+0x3e/0x50
[  274.372460]  ___sys_sendmsg+0x26f/0x2d0
[  274.376293]  ? lock_acquire+0x95/0x190
[  274.380043]  ? __handle_mm_fault+0x7ce/0xb70
[  274.384307]  ? lock_acquire+0x95/0x190
[  274.388053]  ? __audit_syscall_entry+0xdd/0x130
[  274.392586]  ? ktime_get_coarse_real_ts64+0x64/0xc0
[  274.397461]  ? __audit_syscall_entry+0xdd/0x130
[  274.401989]  ? trace_hardirqs_on+0x4c/0x100
[  274.406173]  __sys_sendmsg+0x63/0xa0
[  274.409744]  ? __sys_sendmsg+0x63/0xa0
[  274.413488]  __x64_sys_sendmsg+0x1f/0x30
[  274.417405]  do_syscall_64+0x55/0x190
[  274.421064]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  274.426113] RIP: 0033:0x7ff4ae0e6e87
[  274.429686] Code: 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 80 00
00 00 00 8b 05 ca d9 2b 00 48 63 d2 48 63 ff 85 c0 75 10 b8 2e 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 53 48 89 f3 48 83 ec 10 48 89 7c
24 08
[  274.448422] RSP: 002b:00007ffcd9b76db8 EFLAGS: 00000246 ORIG_RAX:
000000000000002e
[  274.455978] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 00007ff4ae0e6e87
[  274.463104] RDX: 0000000000000000 RSI: 00000000006092e0 RDI: 0000000000000003
[  274.470228] RBP: 0000000000000000 R08: 00007ffcd9bc40a0 R09: 00007ffcd9bc4080
[  274.477349] R10: 000000000000060a R11: 0000000000000246 R12: 0000000000000003
[  274.484475] R13: 0000000000000016 R14: 00007ffcd9b77fa0 R15: 00007ffcd9b78da4
[  274.491602] Modules linked in: cls_bpf sch_ingress iptable_filter
ip_tables algif_hash af_alg x86_pkg_temp_thermal fuse [last unloaded:
test_bpf]
[  274.504634] CR2: 0000000000000000
[  274.507976] ---[ end trace 196d18386545eae1 ]---
[  274.512588] RIP: 0010:          (null)
[  274.516334] Code: Bad RIP value.
[  274.519557] RSP: 0018:ffffbc9681f83540 EFLAGS: 00010286
[  274.524775] RAX: 0000000000000000 RBX: ffffdc967fa80a18 RCX: 0000000000000000
[  274.531921] RDX: ffff9db2ee08b540 RSI: 000000000000000e RDI: ffffdc967fa809a0
[  274.539082] RBP: ffffbc9681f83580 R08: ffff9db2c4d62690 R09: 000000000000000c
[  274.546205] R10: 0000000000000000 R11: ffff9db2ee08b540 R12: ffff9db31ce7c000
[  274.553329] R13: 0000000000000001 R14: 000000000000000c R15: ffff9db3179cf400
[  274.560456] FS:  00007ff4ae7c5740(0000) GS:ffff9db31fa80000(0000)
knlGS:0000000000000000
[  274.568541] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  274.574277] CR2: ffffffffffffffd6 CR3: 00000004574da004 CR4: 00000000003606e0
[  274.581403] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  274.588535] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  274.595658] Kernel panic - not syncing: Fatal exception in interrupt
[  274.602046] Kernel Offset: 0x14400000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  274.612827] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
[  274.620387] ------------[ cut here ]------------

I'm also seeing the same failure on x86_64, and it reproduces
consistently.

>From poking around it looks like the skb's dst entry is being used
to calculate the mtu in:

mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;

...but because that dst_entry  has an "ops" value set to md_dst_ops,
the various ops (including mtu) are not set:

crash> struct sk_buff._skb_refdst ffff928f87447700 -x
      _skb_refdst = 0xffffcd6fbf5ea590
crash> struct dst_entry.ops 0xffffcd6fbf5ea590
  ops = 0xffffffffa0193800
crash> struct dst_ops.mtu 0xffffffffa0193800
  mtu = 0x0
crash>

I confirmed that the dst entry also has dst->input set to
dst_md_discard, so it looks like it's an entry that's been
initialized via __metadata_dst_init alright.

I think the fix here is to use skb_valid_dst(skb) - it checks
for  DST_METADATA also, and with that fix in place, the
problem - which was previously 100% reproducible - disappears.

The below patch resolves the panic and all bpf tunnel tests pass
without incident.

Fixes: c8b34e680a ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:43:06 -08:00
John Hubbard
0c507d8f84 RDMA/umem: Revert broken 'off by one' fix
The previous attempted bug fix overlooked the fact that
ib_umem_odp_map_dma_single_page() was doing a put_page() upon hitting an
error. So there was not really a bug there.

Therefore, this reverts the off-by-one change, but keeps the change to use
release_pages() in the error path.

Fixes: 75a3e6a3c1 ("RDMA/umem: minor bug fix in error handling path")
Suggested-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-06 14:42:37 -04:00
Linus Torvalds
8dcd175bc3 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:

 - a few misc things

 - ocfs2 updates

 - most of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
  tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
  proc: more robust bulk read test
  proc: test /proc/*/maps, smaps, smaps_rollup, statm
  proc: use seq_puts() everywhere
  proc: read kernel cpu stat pointer once
  proc: remove unused argument in proc_pid_lookup()
  fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
  fs/proc/self.c: code cleanup for proc_setup_self()
  proc: return exit code 4 for skipped tests
  mm,mremap: bail out earlier in mremap_to under map pressure
  mm/sparse: fix a bad comparison
  mm/memory.c: do_fault: avoid usage of stale vm_area_struct
  writeback: fix inode cgroup switching comment
  mm/huge_memory.c: fix "orig_pud" set but not used
  mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
  mm/memcontrol.c: fix bad line in comment
  mm/cma.c: cma_declare_contiguous: correct err handling
  mm/page_ext.c: fix an imbalance with kmemleak
  mm/compaction: pass pgdat to too_many_isolated() instead of zone
  mm: remove zone_lru_lock() function, access ->lru_lock directly
  ...
2019-03-06 10:31:36 -08:00
Paolo Abeni
22c74764aa ipv4/route: fail early when inet dev is missing
If a non local multicast packet reaches ip_route_input_rcu() while
the ingress device IPv4 private data (in_dev) is NULL, we end up
doing a NULL pointer dereference in IN_DEV_MFORWARD().

Since the later call to ip_route_input_mc() is going to fail if
!in_dev, we can fail early in such scenario and avoid the dangerous
code path.

v1 -> v2:
 - clarified the commit message, no code changes

Reported-by: Tianhao Zhao <tizhao@redhat.com>
Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:23:18 -08:00
Linus Torvalds
afe6fe7036 Merge tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC late updates from Arnd Bergmann:
 "Here are two branches that came relatively late during the linux-5.0
  development cycle and have dependencies on the other branches:

   - On the TI OMAP platform, the CPSW Ethernet PHY mode selection
     driver is being replaced, this puts the final pieces in place

   - On the DaVinci platform, the interrupt handling code in arch/arm
     gets moved into a regular device driver in drivers/irqchip.

  Since they both had some time in linux-next after the 5.0-rc8 release,
  I'm sending them along with the other updates"

* tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (38 commits)
  net: ethernet: ti: cpsw: deprecate cpsw-phy-sel driver
  ARM: davinci: remove intc related fields from davinci_soc_info
  irqchip: davinci-cp-intc: move the driver to drivers/irqchip
  ARM: davinci: cp-intc: remove redundant comments
  ARM: davinci: cp-intc: drop GPL license boilerplate
  ARM: davinci: cp-intc: use readl/writel_relaxed()
  ARM: davinci: cp-intc: unify error handling
  ARM: davinci: cp-intc: improve coding style
  ARM: davinci: cp-intc: request the memory region before remapping it
  ARM: davinci: cp-intc: use the new-style config structure
  ARM: davinci: cp-intc: convert all hex numbers to lowercase
  ARM: davinci: cp-intc: use a common prefix for all symbols
  ARM: davinci: cp-intc: add the new config structures for da8xx SoCs
  irqchip: davinci-cp-intc: add a new config structure
  ARM: davinci: cp-intc: add a wrapper around cp_intc_init()
  ARM: davinci: cp-intc: remove cp_intc.h
  irqchip: davinci-aintc: move the driver to drivers/irqchip
  ARM: davinci: aintc: remove unnecessary includes
  ARM: davinci: aintc: remove the timer-specific irq_set_handler()
  ARM: davinci: aintc: request memory region before remapping it
  ...
2019-03-06 10:22:26 -08:00
Dan Carpenter
f4772dee10 net: hns3: Fix a logical vs bitwise typo
There were a couple logical ORs accidentally mixed in with the bitwise
ORs.

Fixes: e8149933b1 ("net: hns3: remove hnae3_get_bit in data path")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-06 10:22:16 -08:00
Linus Torvalds
64b1b217f1 Merge tag 'armsoc-newsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM new SoC family support from Arnd Bergmann:
 "Two new SoC families are added this time.

  Sugaya Taichi submitted support for the Milbeaut SoC family from
  Socionext and explains:

    "SC2000 is a SoC of the Milbeaut series. equipped with a DSP
     optimized for computer vision. It also features advanced
     functionalities such as 360-degree, real-time spherical stitching
     with multi cameras, image stabilization for without mechanical
     gimbals, and rolling shutter correction. More detail is below:

       https://www.socionext.com/en/products/assp/milbeaut/SC2000.html"

  Interestingly, this one has a history dating back to older chips made
  by Socionext and previously Matsushita/Panasonic based on their own
  mn10300 CPU architecture that was removed from the kernel last year.

  Manivannan Sadhasivam adds support for another SoC family, this is the
  Bitmain BM1880 chip used in the Sophon Edge TPU developer board.

  The chip is intended for Deep Learning applications, and comes with
  dual-core Arm Cortex-A53 to run Linux as well as a RISC-V
  microcontroller core to control the tensor unit. For the moment, the
  TPU is not accessible in mainline Linux, so we treat it as a generic
  Arm SoC.

  More information is available at

       https://www.sophon.ai/"

* tag 'armsoc-newsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: multi_v7_defconfig: add ARCH_MILBEAUT and ARCH_MILBEAUT_M10V
  ARM: configs: Add Milbeaut M10V defconfig
  ARM: dts: milbeaut: Add device tree set for the Milbeaut M10V board
  clocksource/drivers/timer-milbeaut: Introduce timer for Milbeaut SoCs
  dt-bindings: timer: Add Milbeaut M10V timer description
  ARM: milbeaut: Add basic support for Milbeaut m10v SoC
  dt-bindings: Add documentation for Milbeaut SoCs
  dt-bindings: arm: Add SMP enable-method for Milbeaut
  dt-bindings: sram: milbeaut: Add binding for Milbeaut smp-sram
  MAINTAINERS: Add entry for Bitmain SoC platform
  arm64: dts: bitmain: Add Sophon Egde board support
  arm64: dts: bitmain: Add BM1880 SoC support
  arm64: Add ARCH_BITMAIN platform
  dt-bindings: arm: Document Bitmain BM1880 SoC
2019-03-06 10:15:42 -08:00
Linus Torvalds
fb686ad25b Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC defconfig updates from Arnd Bergmann:
 "We regenerated the defconfig files for samsung, shmobile, lpc18xx,
  lpc32xx, omap2, and nhk8815.

  Lots of additional drivers added on samsung and nhk8815, as well as
  the new pl110 driver on all machines that have it.

  The remaining changes are mostly to enable newly added drivers, and in
  case of imx8mq together with the SoC getting merged"

* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (47 commits)
  ARM: spear3xx_defconfig: Activate PL111 DRM driver
  ARM: nhk8815_defconfig: Add new options
  ARM: nhk8815_defconfig: Update defconfig
  ARM: pxa: remove CONFIG_SND_PXA2XX_AC97 in pxa_defconfig
  ARM: defconfig: integrator: Switch to DRM
  arm64: defconfig: Add IMX2+ watchdog
  arm64: defconfig: Enable PFUZE100 regulator
  arm64: defconfig: enable NXP FlexSPI driver
  arm64: defconfig: Add i.MX8MQ boot necessary configs
  arm64: defconfig: add imx8qxp support
  arm64: defconfig: add i.MX system controller RTC support
  arm64: defconfig: Enable Tegra TCU
  arm64: defconfig: Enable MAX8973 regulator
  ARM: socfpga_defconfig: enable BLK_DEV_LOOP config option
  ARM: defconfig: lpc32xx: enable DRM simple panel driver
  ARM: defconfig: lpc32xx: enable fixed voltage regulator support
  arm64: defconfig: Enable SUN6I Camera sensor interface
  arm64: defconfig: Enable I2C_GPIO
  ARM: omap2plus_defconfig: Update for moved options
  ARM: omap2plus_defconfig: Update for dropped options
  ...
2019-03-06 10:09:50 -08:00
Linus Torvalds
384d11fa0e Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
 "As usual, the drivers/tee and drivers/reset subsystems get merged
  here, with the expected set of smaller updates and some new hardware
  support. The tee subsystem now supports device drivers to be attached
  to a tee, the first example here is a random number driver with its
  implementation in the secure world.

  Three new power domain drivers get added for specific chip families:
   - Broadcom BCM283x chips (used in Raspberry Pi)
   - Qualcomm Snapdragon phone chips
   - Xilinx ZynqMP FPGA SoCs

  One new driver is added to talk to the BPMP firmware on NVIDIA
  Tegra210

  Existing drivers are extended for new SoC variants from NXP, NVIDIA,
  Amlogic and Qualcomm"

* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (113 commits)
  tee: optee: update optee_msg.h and optee_smc.h to dual license
  tee: add cancellation support to client interface
  dpaa2-eth: configure the cache stashing amount on a queue
  soc: fsl: dpio: configure cache stashing destination
  soc: fsl: dpio: enable frame data cache stashing per software portal
  soc: fsl: guts: make fsl_guts_get_svr() static
  hwrng: make symbol 'optee_rng_id_table' static
  tee: optee: Fix unsigned comparison with less than zero
  hwrng: Fix unsigned comparison with less than zero
  tee: fix possible error pointer ctx dereferencing
  hwrng: optee: Initialize some structs using memset instead of braces
  tee: optee: Initialize some structs using memset instead of braces
  soc: fsl: dpio: fix memory leak of a struct qbman on error exit path
  clk: tegra: dfll: Make symbol 'tegra210_cpu_cvb_tables' static
  soc: qcom: llcc-slice: Fix typos
  qcom: soc: llcc-slice: Consolidate some code
  qcom: soc: llcc-slice: Clear the global drv_data pointer on error
  drivers: soc: xilinx: Add ZynqMP power domain driver
  firmware: xilinx: Add APIs to control node status/power
  dt-bindings: power: Add ZynqMP power domain bindings
  ...
2019-03-06 09:41:12 -08:00
Linus Torvalds
6ad63dec9c Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC device tree updates from Arnd Bergmann:
 "This is a smaller update than the past few times, but with just over
  500 non-merge changesets still dwarfes the rest of the SoC tree.

  Three new SoC platforms get added, each one a follow-up to an existing
  product, and added here in combination with a reference platform:

   - Renesas RZ/A2M (R7S9210) 32-bit Cortex-A9 Real-time imaging
     processor:

       https://www.renesas.com/eu/en/products/microcontrollers-microprocessors/rz/rza/rza2m.html

   - Renesas RZ/G2E (r8a774c0) 64-bit Cortex-A53 SoC "for Rich Graphics
     Applications":

       https://www.renesas.com/eu/en/products/microcontrollers-microprocessors/rz/rzg/rzg2e.html

   - NXP i.MX8QuadXPlus 64-bit Cortex-A35 SoC:

       https://www.nxp.com/products/processors-and-microcontrollers/arm-based-processors-and-mcus/i.mx-applications-processors/i.mx-8-processors/i.mx-8x-family-arm-cortex-a35-3d-graphics-4k-video-dsp-error-correcting-code-on-ddr:i.MX8X

  These are actual commercial products we now support with an in-kernel
  device tree source file:

   - Bosch Guardian is a product made by Bosch Power Tools GmbH, based
     on the Texas Instruments AM335x chip

   - Winterland IceBoard is a Texas Instruments AM3874 based machine
     used in telescopes at the south pole and elsewhere, see commit
     d031773169 for some pointers:

   - Inspur on5263m5 is an x86 server platform with an Aspeed ast2500
     baseboard management controller. This is for running on the BMC.

   - Zodiac Digital Tapping Unit, apparently a kind of ethernet switch
     used in airplanes.

   - Phicomm K3 is a WiFi router based on Broadcom bcm47094

   - Methode Electronics uDPU FTTdp distribution point unit

   - X96 Max, a generic TV box based on Amlogic G12a (S905X2)

   - NVIDIA Shield TV (Darcy) based on Tegra210

  And then there are several new SBC, evaluation, development or modular
  systems that we add:

   - Three new Rockchips rk3399 based boards:
       - FriendlyElec NanoPC-T4 and NanoPi M4
       - Radxa ROCK Pi 4

   - Five new i.MX6 family SoM modules and boards for industrial
     products:
       - Logic PD i.MX6QD SoM and evaluation baseboad
       - Y Soft IOTA Draco/Hydra/Ursa family boards based on i.MX6DL
       - Phytec phyCORE i.MX6 UltraLite SoM and evaluation module

   - MYIR Tech MYD-LPC4357 development based on the NXP lpc4357
     microcontroller

   - Chameleon96, an Intel/Altera Cyclone5 based FPGA development system
     in 96boards form factor

   - Arm Fixed Virtual Platforms(FVP) Base RevC, a purely virtual
     platform for corresponding to the latest "fast model"

   - Another Raspberry Pi variant: Model 3 A+, supported both in 32-bit
     and 64-bit mode.

   - Oxalis Evalkit V100 based on NXP Layerscape LS1012a, in 96Boards
     enterprise form factor

   - Elgin RV1108 R1 development board based on 32-bit Rockchips RV1108

  For already supported boards and SoCs, we often add support for new
  devices after merging the drivers. This time, the largest changes
  include updates for

   - STMicroelectronics stm32mp1, which was now formally launched last
     week

   - Qualcomm Snapdragon 845, a high-end phone and low-end laptop chip

   - Action Semi S700

   - TI AM654x, their recently merged 64-bit SoC from the OMAP family

   - Various Amlogic Meson SoCs

   - Mediatek MT2712

   - NVIDIA Tegra186 and Tegra210

   - The ancient NXP lpc32xx family

   - Samsung s5pv210, used in some older mobile phones

  Many other chips see smaller updates and bugfixes beyond that"

* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (506 commits)
  ARM: dts: exynos: Fix max voltage for buck8 regulator on Odroid XU3/XU4
  dt-bindings: net: ti: deprecate cpsw-phy-sel bindings
  ARM: dts: am335x: switch to use phy-gmii-sel
  ARM: dts: am4372: switch to use phy-gmii-sel
  ARM: dts: dm814x: switch to use phy-gmii-sel
  ARM: dts: dra7: switch to use phy-gmii-sel
  arch: arm: dts: kirkwood-rd88f6281: Remove disabled marvell,dsa reference
  ARM: dts: exynos: Add support for secondary DAI to Odroid XU4
  ARM: dts: exynos: Add support for secondary DAI to Odroid XU3
  ARM: dts: exynos: Disable ARM PMU on Odroid XU3-lite
  ARM: dts: exynos: Add stdout path property to Arndale board
  ARM: dts: exynos: Add minimal clkout parameters to Exynos3250 PMU
  ARM: dts: exynos: Enable ADC on Odroid HC1
  arm64: dts: sprd: Remove wildcard compatible string
  arm64: dts: sprd: Add SC27XX fuel gauge device
  arm64: dts: sprd: Add SC2731 charger device
  arm64: dts: sprd: Add ADC calibration support
  arm64: dts: sprd: Remove PMIC INTC irq trigger type
  arm64: dts: rockchip: Enable tsadc device on rock960
  ARM: dts: rockchip: add chosen node on veyron devices
  ...
2019-03-06 09:36:37 -08:00
Felipe Franciosi
3722e6a521 scsi: virtio_scsi: don't send sc payload with tmfs
The virtio scsi spec defines struct virtio_scsi_ctrl_tmf as a set of
device-readable records and a single device-writable response entry:

    struct virtio_scsi_ctrl_tmf
    {
        // Device-readable part
        le32 type;
        le32 subtype;
        u8 lun[8];
        le64 id;
        // Device-writable part
        u8 response;
    }

The above should be organised as two descriptor entries (or potentially
more if using VIRTIO_F_ANY_LAYOUT), but without any extra data after "le64
id" or after "u8 response".

The Linux driver doesn't respect that, with virtscsi_abort() and
virtscsi_device_reset() setting cmd->sc before calling virtscsi_tmf().  It
results in the original scsi command payload (or writable buffers) added to
the tmf.

This fixes the problem by leaving cmd->sc zeroed out, which makes
virtscsi_kick_cmd() add the tmf to the control vq without any payload.

Cc: stable@vger.kernel.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 12:35:02 -05:00
Linus Torvalds
aebbfafc74 Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC platform updates from Arnd Bergmann:
 "The APM X-Gene platform is now maintained by folks from Ampere
  computing that took over the product line a while ago, this gets
  reflected in the MAINTAINERS file.

  Cleanups continue on the older mach-davinci and mach-pxa platform, to
  get them to be more like the modern ones. For pxa, we now remove the
  Raumfeld platform code as it now works with device tree based booting.

  i.MX adds a couple new features for the i.MX7ULP SoC

  Mediatek gains support for a new SoC: MT7629 is a new wireless router
  platform, following MT7623.

  Aside from those, there are the usual minor cleanups and bugfixes
  across several platforms"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (49 commits)
  MAINTAINERS: Update Ampere email address
  usb: ohci-da8xx: remove unused callbacks from platform data
  ARM: davinci: da830-evm: remove legacy usb helpers
  ARM: davinci: omapl138-hawk: remove legacy usb helpers
  usb: ohci-da8xx: add vbus and overcurrent gpios
  ARM: davinci: da830-evm: use gpio lookup entries for usb gpios
  ARM: davinci: omapl138-hawk: use gpio lookup entries for usb gpios
  usb: ohci-da8xx: add a helper pointer to &pdev->dev
  usb: ohci-da8xx: add a new line after local variables
  arm64: meson: enable g12a clock controller
  MAINTAINERS: Add entry for uDPU board
  ARM: davinci: da850-evm: use GPIO hogs instead of the legacy API
  arm: mediatek: add MT7629 smp bring up code
  Revert "ARM: mediatek: add MT7623a smp bringup code"
  dt-bindings: soc: fix typo of MT8173 power dt-bindings
  ARM: meson: remove COMMON_CLK_AMLOGIC selection
  arm64: meson: remove COMMON_CLK_AMLOGIC selection
  ARM: lpc32xx: remove platform data of ARM PL111 LCD controller
  ARM: lpc32xx: remove platform data of ARM PL180 SD/MMC controller
  ARM: lpc32xx: Use kmemdup to replace duplicating its implementation
  ...
2019-03-06 09:33:05 -08:00
Erwan Velu
441b7195e2 scsi: smartpqi: Reporting 'logical unit failure'
When the HARDWARE_ERROR/0x3e/0x1 case is triggered, the logical volume
is offlined.  When reading the kernel log, the reason why the device
got offlined isn't reported to the user.  This situation makes it
difficult for admins to root cause.

Log a message when this condition occurs.

[mkp: tweaked commit message]

Signed-off-by: Erwan Velu <e.velu@criteo.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 12:31:35 -05:00
Linus Torvalds
fa29f5ba42 Merge tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
 "Only a few small changes this time:

   - Michael S. Tsirkin cleans up linux/mman.h

   - Mike Rapoport found a typo

  I had originally merged another cleanup series for I/O accessors from
  Hugo Lefeuvre as well, but dropped it after the discussion of the
  barrier semantics and some conflicts. I expect this series to get
  merged for a later release though"

* tag 'asm-generic-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic/page.h: fix typo in #error text requiring a real asm/page.h
  arch: move common mmap flags to linux/mman.h
  drm: tweak header name
  x86/mpx: tweak header name
2019-03-06 09:18:43 -08:00
Linus Torvalds
78e10b5e5a Merge tag 'y2038-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 build fix for compat mode from Arnd Bergmann:
 "Here is one more patch on top of the y2038 changes already pulled for
  linux-5.1, for some reason this had escaped all testing"

* tag 'y2038-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
  ipc: Fix building compat mode without sysvipc
2019-03-06 09:07:08 -08:00
Jonathan Corbet
4064174bec docs: Bring some order to filesystem documentation
Documentation/filesystems is, like much of the rest of the kernel's
documentation, a jumble of unorganized information.  Split the
documentation into categories and try to bring some order to the top-level
index.rst files.  No text changes other than a few section-introductory
blurbs; this is all just moving stuff around.

Cc: linux-fsdevel@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-03-06 09:46:10 -07:00
Linus Torvalds
6ea98b4baa Merge branch 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 alternative instruction updates from Ingo Molnar:
 "Small RDTSCP opimization, enabled by the newly added ALTERNATIVE_3(),
  and other small improvements"

* 'x86-alternatives-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/TSC: Use RDTSCP
  x86/alternatives: Add an ALTERNATIVE_3() macro
  x86/alternatives: Print containing function
  x86/alternatives: Add macro comments
2019-03-06 08:45:46 -08:00
Ming Lei
05b700ba60 block: fix segment calculation for passthrough IO
blk_recount_segments() can be called in bio_add_pc_page() for
calculating how many segments this bio will has after one page is added
to this bio. If the resulted segment number is beyond the queue limit,
the added page will be removed.

The try-and-fix policy requires blk_recount_segments(__blk_recalc_rq_segments)
to not consider the segment number limit. Unfortunately bvec_split_segs()
does check this limit, and causes small segment number returned to
bio_add_pc_page(), then page still may be added to the bio even though
segment number limit becomes broken.

Fixes this issue by not considering segment number limit when calcualting
bio's segment number.

Fixes: dcebd75592 ("block: use bio_for_each_bvec() to compute multi-page bvec count")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-03-06 09:42:54 -07:00
Jens Axboe
e61750c847 Merge branch 'stable/for-jens-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-5.1/block-post
Pull two xen blkback fixes from Konrad.

* 'stable/for-jens-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/blkback: rework connect_ring() to avoid inconsistent xenstore 'ring-page-order' set by malicious blkfront
  xen/blkback: add stack variable 'blkif' in connect_ring()
2019-03-06 09:41:54 -07:00
Arnd Bergmann
cfdbb4ed31 vhost: silence an unused-variable warning
On some architectures, the MMU can be disabled, leading to access_ok()
becoming an empty macro that does not evaluate its size argument,
which in turn produces an unused-variable warning:

drivers/vhost/vhost.c:1191:9: error: unused variable 's' [-Werror,-Wunused-variable]
        size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;

Mark the variable as __maybe_unused to shut up that warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:32:37 -05:00
Cornelia Huck
ab7a2375fb virtio: hint if callbacks surprisingly might sleep
A virtio transport is free to implement some of the callbacks in
virtio_config_ops in a matter that they cannot be called from
atomic context (e.g. virtio-ccw, which maps a lot of the callbacks
to channel I/O, which is an inherently asynchronous mechanism).
This can be very surprising for developers using the much more
common virtio-pci transport, just to find out that things break
when used on s390.

The documentation for virtio_config_ops now contains a comment
explaining this, but it makes sense to add a might_sleep() annotation
to various wrapper functions in the virtio core to avoid surprises
later.

Note that annotations are NOT added to two classes of calls:
- direct calls from device drivers (all current callers should be
  fine, however)
- calls which clearly won't be made from atomic context (such as
  those ultimately coming in via the driver core)

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:57 -05:00
Cornelia Huck
971bedca26 virtio-ccw: wire up ->bus_name callback
Return the bus id of the ccw proxy device. This makes 'ethtool -i'
show a more useful value than 'virtio' in the bus-info field.

Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:56 -05:00
Halil Pasic
3438b2c039 s390/virtio: handle find on invalid queue gracefully
A queue with a capacity of zero is clearly not a valid virtio queue.
Some emulators report zero queue size if queried with an invalid queue
index. Instead of crashing in this case let us just return -ENOENT. To
make that work properly, let us fix the notifier cleanup logic as well.

Cc: stable@vger.kernel.org
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:44 -05:00
Cornelia Huck
8457fdfeb1 virtio-ccw: diag 500 may return a negative cookie
If something goes wrong in the kvm io bus handling, the virtio-ccw
diagnose may return a negative error value in the cookie gpr.

Document this.

Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:33 -05:00
Wei Wang
59f3397ca7 virtio_balloon: remove the unnecessary 0-initialization
We've changed to kzalloc the vb struct, so no need to 0-initialize
this field one more time.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-03-06 11:19:33 -05:00
Wei Wang
53e946cb34 virtio-balloon: improve update_balloon_size_func
There is no need to update the balloon actual register when there is no
ballooning request. This patch avoids update_balloon_size when diff is 0.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:33 -05:00
Joerg Roedel
fd1068e186 virtio-blk: Consider virtio_max_dma_size() for maximum segment size
Segments can't be larger than the maximum DMA mapping size
supported on the platform. Take that into account when
setting the maximum segment size for a block device.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:26 -05:00
Joerg Roedel
e6d6dd6c87 virtio: Introduce virtio_max_dma_size()
This function returns the maximum segment size for a single
dma transaction of a virtio device. The possible limit comes
from the SWIOTLB implementation in the Linux kernel, that
has an upper limit of (currently) 256kb of contiguous
memory it can map. Other DMA-API implementations might also
have limits.

Use the new dma_max_mapping_size() function to determine the
maximum mapping size when DMA-API is in use for virtio.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:15 -05:00
Joerg Roedel
133d624b1c dma: Introduce dma_max_mapping_size()
The function returns the maximum size that can be mapped
using DMA-API functions. The patch also adds the
implementation for direct DMA and a new dma_map_ops pointer
so that other implementations can expose their limit.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:11 -05:00
Joerg Roedel
492366f7b4 swiotlb: Add is_swiotlb_active() function
This function will be used from dma_direct code to determine
the maximum segment size of a dma mapping.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:19:06 -05:00
Joerg Roedel
abe420bfae swiotlb: Introduce swiotlb_max_mapping_size()
The function returns the maximum size that can be remapped
by the SWIOTLB implementation. This function will be later
exposed to users through the DMA-API.

Cc: stable@vger.kernel.org
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-03-06 11:18:50 -05:00
Linus Torvalds
45802da05e Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - refcount conversions

   - Solve the rq->leaf_cfs_rq_list can of worms for real.

   - improve power-aware scheduling

   - add sysctl knob for Energy Aware Scheduling

   - documentation updates

   - misc other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  kthread: Do not use TIMER_IRQSAFE
  kthread: Convert worker lock to raw spinlock
  sched/fair: Use non-atomic cpumask_{set,clear}_cpu()
  sched/fair: Remove unused 'sd' parameter from select_idle_smt()
  sched/wait: Use freezable_schedule() when possible
  sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
  sched/fair: Explain LLC nohz kick condition
  sched/fair: Simplify nohz_balancer_kick()
  sched/topology: Fix percpu data types in struct sd_data & struct s_data
  sched/fair: Simplify post_init_entity_util_avg() by calling it with a task_struct pointer argument
  sched/fair: Fix O(nr_cgroups) in the load balancing path
  sched/fair: Optimize update_blocked_averages()
  sched/fair: Fix insertion in rq->leaf_cfs_rq_list
  sched/fair: Add tmp_alone_branch assertion
  sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock()
  sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
  sched/pelt: Skip updating util_est when utilization is higher than CPU's capacity
  sched/fair: Update scale invariance of PELT
  sched/fair: Move the rq_of() helper function
  sched/core: Convert task_struct.stack_refcount to refcount_t
  ...
2019-03-06 08:14:05 -08:00
Linus Torvalds
203b6609e0 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Lots of tooling updates - too many to list, here's a few highlights:

   - Various subcommand updates to 'perf trace', 'perf report', 'perf
     record', 'perf annotate', 'perf script', 'perf test', etc.

   - CPU and NUMA topology and affinity handling improvements,

   - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

   - BPF updates

   - Tons of infrastructure updates, both on the build system and the
     library support side

   - Documentation updates.

   - ... and lots of other changes, see the changelog for details.

  Kernel side updates:

   - Tighten up kprobes blacklist handling, reduce the number of places
     where developers can install a kprobe and hang/crash the system.

   - Fix/enhance vma address filter handling.

   - Various PMU driver updates, small fixes and additions.

   - refcount_t conversions

   - BPF updates

   - error code propagation enhancements

   - misc other changes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
  perf script python: Add Python3 support to syscall-counts-by-pid.py
  perf script python: Add Python3 support to syscall-counts.py
  perf script python: Add Python3 support to stat-cpi.py
  perf script python: Add Python3 support to stackcollapse.py
  perf script python: Add Python3 support to sctop.py
  perf script python: Add Python3 support to powerpc-hcalls.py
  perf script python: Add Python3 support to net_dropmonitor.py
  perf script python: Add Python3 support to mem-phys-addr.py
  perf script python: Add Python3 support to failed-syscalls-by-pid.py
  perf script python: Add Python3 support to netdev-times.py
  perf tools: Add perf_exe() helper to find perf binary
  perf script: Handle missing fields with -F +..
  perf data: Add perf_data__open_dir_data function
  perf data: Add perf_data__(create_dir|close_dir) functions
  perf data: Fail check_backup in case of error
  perf data: Make check_backup work over directories
  perf tools: Add rm_rf_perf_data function
  perf tools: Add pattern name checking to rm_rf
  perf tools: Add depth checking to rm_rf
  perf data: Add global path holder
  ...
2019-03-06 07:59:36 -08:00
Linus Torvalds
3478588b51 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The biggest part of this tree is the new auto-generated atomics API
  wrappers by Mark Rutland.

  The primary motivation was to allow instrumentation without uglifying
  the primary source code.

  The linecount increase comes from adding the auto-generated files to
  the Git space as well:

    include/asm-generic/atomic-instrumented.h     | 1689 ++++++++++++++++--
    include/asm-generic/atomic-long.h             | 1174 ++++++++++---
    include/linux/atomic-fallback.h               | 2295 +++++++++++++++++++++++++
    include/linux/atomic.h                        | 1241 +------------

  I preferred this approach, so that the full call stack of the (already
  complex) locking APIs is still fully visible in 'git grep'.

  But if this is excessive we could certainly hide them.

  There's a separate build-time mechanism to determine whether the
  headers are out of date (they should never be stale if we do our job
  right).

  Anyway, nothing from this should be visible to regular kernel
  developers.

  Other changes:

   - Add support for dynamic keys, which removes a source of false
     positives in the workqueue code, among other things (Bart Van
     Assche)

   - Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)

   - qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)

   - misc other updates and enhancements"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  locking/lockdep: Shrink struct lock_class_key
  locking/lockdep: Add module_param to enable consistency checks
  lockdep/lib/tests: Test dynamic key registration
  lockdep/lib/tests: Fix run_tests.sh
  kernel/workqueue: Use dynamic lockdep keys for workqueues
  locking/lockdep: Add support for dynamic keys
  locking/lockdep: Verify whether lock objects are small enough to be used as class keys
  locking/lockdep: Check data structure consistency
  locking/lockdep: Reuse lock chains that have been freed
  locking/lockdep: Fix a comment in add_chain_cache()
  locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
  locking/lockdep: Reuse list entries that are no longer in use
  locking/lockdep: Free lock classes that are no longer in use
  locking/lockdep: Update two outdated comments
  locking/lockdep: Make it easy to detect whether or not inside a selftest
  locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
  locking/lockdep: Initialize the locks_before and locks_after lists earlier
  locking/lockdep: Make zap_class() remove all matching lock order entries
  locking/lockdep: Reorder struct lock_class members
  locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
  ...
2019-03-06 07:17:17 -08:00
Linus Torvalds
c8f5ed6ef9 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The main EFI changes in this cycle were:

   - Use 32-bit alignment for efi_guid_t

   - Allow the SetVirtualAddressMap() call to be omitted

   - Implement earlycon=efifb based on existing earlyprintk code

   - Various minor fixes and code cleanups from Sai, Ard and me"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: Fix build error due to enum collision between efi.h and ima.h
  efi/x86: Convert x86 EFI earlyprintk into generic earlycon implementation
  x86: Make ARCH_USE_MEMREMAP_PROT a generic Kconfig symbol
  efi/arm/arm64: Allow SetVirtualAddressMap() to be omitted
  efi: Replace GPL license boilerplate with SPDX headers
  efi/fdt: Apply more cleanups
  efi: Use 32-bit alignment for efi_guid_t
  efi/memattr: Don't bail on zero VA if it equals the region's PA
  x86/efi: Mark can_free_region() as an __init function
2019-03-06 07:13:56 -08:00
Arnd Bergmann
7e89a37c47 ipc: Fix building compat mode without sysvipc
As John Stultz noticed, my y2038 syscall series caused a link
failure when CONFIG_SYSVIPC is disabled but CONFIG_COMPAT is
enabled:

arch/arm64/kernel/sys32.o:(.rodata+0x960): undefined reference to `__arm64_compat_sys_old_semctl'
arch/arm64/kernel/sys32.o:(.rodata+0x980): undefined reference to `__arm64_compat_sys_old_msgctl'
arch/arm64/kernel/sys32.o:(.rodata+0x9a0): undefined reference to `__arm64_compat_sys_old_shmctl'

Add the missing entries in kernel/sys_ni.c for the new system
calls.

Cc: Laura Abbott <labbott@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2019-03-06 16:00:56 +01:00
Yihao Wu
dd838821f0 nfsd: fix wrong check in write_v4_end_grace()
Commit 62a063b8e7 "nfsd4: fix crash on writing v4_end_grace before
nfsd startup" is trying to fix a NULL dereference issue, but it
mistakenly checks if the nfsd server is started. So fix it.

Fixes: 62a063b8e7 "nfsd4: fix crash on writing v4_end_grace before nfsd startup"
Cc: stable@vger.kernel.org
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Yihao Wu <wuyihao@linux.alibaba.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2019-03-06 09:58:20 -05:00
Mikulas Patocka
2255574468 dm integrity: limit the rate of error messages
When using dm-integrity underneath md-raid, some tests with raid
auto-correction trigger large amounts of integrity failures - and all
these failures print an error message. These messages can bring the
system to a halt if the system is using serial console.

Fix this by limiting the rate of error messages - it improves the speed
of raid recovery and avoids the hang.

Fixes: 7eada909bf ("dm: add integrity target")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-03-06 09:03:00 -05:00
Tim Smith
7c03e756b4 gfs2: Fix an incorrect gfs2_assert()
When updating the inode information after a change in allocation,
convert the change into the same units as the inode's i_blocks count
before comparing it in an assertion.

Also, change the comparison so that it is still possible to set i_blocks
to zero by adding -i_blocks, something that was previously only possible
because of the difference in units.

Signed-off-by: Tim Smith <tim.smith@citrix.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
2019-03-06 07:00:43 -07:00
Martin Schwidefsky
152e9b8676 s390/vtime: steal time exponential moving average
To be able to judge the current overcommitment ratio for a CPU add
a lowcore field with the exponential moving average of the steal time.
The average is updated every tick.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-06 14:59:50 +01:00
Yang Wei
a53837a545 perf clang: Remove needless extra semicolon
Delete a superfluous semicolon in getBPFObjectFromModule().

Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Wei <albin_yang@163.com>
Link: http://lkml.kernel.org/r/1551710174-3349-1-git-send-email-albin_yang@163.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06 09:47:48 -03:00
Arnaldo Carvalho de Melo
3163613c5b perf bpf: Automatically add BTF ELF markers
The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
place with the keys and values types of maps so that one can store the
struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
= BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
MAP_ID'.

Since we already have this for defining maps in 'perf trace' BPF events:

   bpf_map(name, _type, type_key, type_val, _max_entries)

As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:

 --- 8< ---

struct syscall {
        bool    enabled;
};

bpf_map(syscalls, ARRAY, int, struct syscall, 512);

 --- 8< ---

All we need is to get all that already available info, piggyback on the
'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
'perf trace' BPF programs and do that without requiring changes to the
BPF programs already defining maps using 'bpf_map()'.

So this is what we have before this patch:

1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
   so that we can use the .o later as a pre-compiled BPF bytecode:

  # grep '\[llvm\]' -A2 ~/.perfconfig
  [llvm]
	dump-obj = true
	clang-opt = -g

  #
  # clang --version
  clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
  Target: x86_64-unknown-linux-gnu
  Thread model: posix
  InstalledDir: /opt/llvm/bin

2) Note the -g there so that we get clang to generate debuginfo, and
   since the target is 'bpf' it will generate the BTF info in this
   clang version (9.0).

3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
   source code:

  # perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data ]

  # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped

4) Look at the BTF structs encoded in it:

  # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  syscall_enter_args	64	0
  augmented_filename	264	0
  syscall	1	0
  syscall_exit_args	24	0
  bpf_map	28	0
  #
  # pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  # pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  struct syscall {
	  bool                       enabled;              /*     0     1 */

	  /* size: 1, cachelines: 1, members: 1 */
	  /* last cacheline: 1 bytes */
  };
  #

5) Ok, with just this we don't have the markers expected by the libbpf
   loader and when we run with this BPF bytecode, because we have:

  # grep '\[trace\]' -A1 ~/.perfconfig
  [trace]
	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  #

6) Lets do a 'perf trace' system wide session using this BPF program:

   # perf trace -e *mmsg,open*
  Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
  DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
  Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
  lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12

7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
  loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):

  # bpftool map list | tail -6
  149: perf_event_array  name __augmented_sys  flags 0x0
	  key 4B  value 4B  max_entries 8  memlock 4096B
  150: array  name syscalls  flags 0x0
	  key 4B  value 1B  max_entries 512  memlock 8192B
  151: hash  name pids_filtered  flags 0x0
	  key 4B  value 1B  max_entries 64  memlock 8192B
  #

8) Dump the "pids_filtered", map, that will have one entry per PID that
   'perf trace' wants filtered, which includes its own, to avoid a
   tracing feedback loop (perf trace shows the syscalls it does which
   generates more syscalls that it has to show that...), it also
   auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
   same reason:

  # bpftool map dump id 151
  key: a5 0c 00 00  value: 01
  key: 14 63 00 00  value: 01
  Found 2 elements
  #

9) Since there is no BTF info available, it does a generic hex dump :-\

10) Now, with this patch applied, we'll do steps 3 to 6 again and look
    with pahole if there are extra structs encoded in BTF:

  # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  syscall_enter_args	64	0
  augmented_filename	264	0
  syscall	1	0
  syscall_exit_args	24	0
  bpf_map	28	0
  ____btf_map___augmented_syscalls__	8	0
  ____btf_map_syscalls	8	0
  ____btf_map_pids_filtered	8	0
  #

11) Yes, those __btf_map_ + the map names, lets see how they look like:

  # pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  struct ____btf_map_syscalls {
	  int                        key;                  /*     0     4 */
	  struct syscall             value;                /*     4     1 */

	  /* size: 8, cachelines: 1, members: 2 */
	  /* padding: 3 */
	  /* last cacheline: 8 bytes */
  };
  #

12) Lets repeat step 7 to get the new map ids:

  # bpftool map list | tail -6
  155: perf_event_array  name __augmented_sys  flags 0x0
	  key 4B  value 4B  max_entries 8  memlock 4096B
  156: array  name syscalls  flags 0x0
	  key 4B  value 1B  max_entries 512  memlock 8192B
  157: hash  name pids_filtered  flags 0x0
	  key 4B  value 1B  max_entries 64  memlock 8192B
  #

13) And finally lets dump the 'pids_filtered':

  # bpftool map dump id 157
  [{
        "key": 3237,
        "value": true
    },{
        "key": 26435,
        "value": true
    }
  ]
  #

Looks much better! BTF info was used to interpret the key as an integer
and the value as a struct with just one boolean member, so to make it
more compact, show just the 'true' value where we saw '01'.

Now to make 'perf trace --dump-map' to use BTF!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-03-06 09:45:37 -03:00
Rafael J. Wysocki
2c0bf86c7c Merge tag 'linux-cpupower-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Pull cpupower updates for 5.1-rc1 from Shuah Khan:

"This cpupower update for Linux 5.1-rc1 consists of a patch to add
 support to display boost frequency separately from Abhishek Goel."

* tag 'linux-cpupower-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
  tools/power/cpupower: Display boost frequency separately
2019-03-06 10:40:23 +01:00
Harald Freudenberger
01396a374c s390/zcrypt: revisit ap device remove procedure
Working with the vfio-ap driver let to some revisit of the way
how an ap (queue) device is removed from the driver.
With the current implementation all the cleanup was done before
the driver even got notified about the removal. Now the ap
queue removal is done in 3 steps:
1) A preparation step, all ap messages within the queue
   are flushed and so the driver does 'receive' them.
   Also a new state AP_STATE_REMOVE assigned to the queue
   makes sure there are no new messages queued in.
2) Now the driver's remove function is invoked and the
   driver should do the job of cleaning up it's internal
   administration lists or whatever. After 2) is done
   it is guaranteed, that the driver is not invoked any
   more. On the other hand the driver has to make sure
   that the APQN is not accessed any more after step 2
   is complete.
3) Now the ap bus code does the job of total cleanup of the
   APQN. A reset with zero is triggered and the state of
   the queue goes to AP_STATE_UNBOUND.
   After step 3) is complete, the ap queue has no pending
   messages and the APQN is cleared and so there are no
   requests and replies lingering around in the firmware
   queue for this APQN. Also the interrupts are disabled.

After these remove steps the ap queue device may be assigned
to another driver.

Stress testing this remove/probe procedure showed a problem with the
correct module reference counting. The actual receive of an reply in
the driver is done asynchronous with completions. So with a driver
change on an ap queue the message flush triggers completions but the
threads waiting for the completions may run at a time where the queue
already has the new driver assigned. So the module_put() at receive
time needs to be done on the driver module which queued the ap
message. This change is also part of this patch.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-06 09:34:03 +01:00
Martin Schwidefsky
cd479eccd2 s390: limit brk randomization to 32MB
For a 64-bit process the randomization of the program break is quite
large with 1GB. That is as big as the randomization of the anonymous
mapping base, for a test case started with '/lib/ld64.so.1 <exec>'
it can happen that the heap is placed after the stack. To avoid
this limit the program break randomization to 32MB for 64-bit and
keep 8MB for 31-bit.

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2019-03-06 09:34:03 +01:00
Peter Zijlstra (Intel)
400816f60c perf/x86/intel: Implement support for TSX Force Abort
Skylake (and later) will receive a microcode update to address a TSX
errata. This microcode will, on execution of a TSX instruction
(speculative or not) use (clobber) PMC3. This update will also provide
a new MSR to change this behaviour along with a CPUID bit to enumerate
the presence of this new MSR.

When the MSR gets set; the microcode will no longer use PMC3 but will
Force Abort every TSX transaction (upon executing COMMIT).

When TSX Force Abort (TFA) is allowed (default); the MSR gets set when
PMC3 gets scheduled and cleared when, after scheduling, PMC3 is
unused.

When TFA is not allowed; clear PMC3 from all constraints such that it
will not get used.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2019-03-06 09:25:41 +01:00