Jason Gunthorpe
6989aa62d3
Merge tag 'v5.9-rc3' into rdma.git for-next
...
Required due to dependencies in following patches.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 12:28:12 -03:00
Mark Zhang
7c4b1ab9f1
IB/mlx5: Add DCT RoCE LAG support
...
When DCT QPs work in RoCE LAG mode:
1. DCT creation is allowed only when it is supported
2. The "port" of a DCT QP is assigned in a round-robin way
Link: https://lore.kernel.org/r/20200818115245.700581-3-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 08:34:28 -03:00
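The round-robin port assignment described in "IB/mlx5: Add DCT RoCE LAG support" can be pictured with a small userspace sketch. This is not the driver's code: `struct lag_dev`, `lag_pick_port()` and the field names are hypothetical, and the sketch only assumes that ports are numbered from 1 and that a shared counter selects the next one.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical per-device state; the real driver keeps its own bookkeeping. */
struct lag_dev {
	atomic_uint next_port;   /* ever-increasing selector shared by all QPs */
	unsigned int num_ports;  /* number of physical ports in the LAG bond */
};

/* Return a 1-based port number, cycling through all LAG ports in turn. */
static unsigned int lag_pick_port(struct lag_dev *dev)
{
	assert(dev->num_ports > 0);
	/* atomic_fetch_add() returns the value held before the increment. */
	return atomic_fetch_add(&dev->next_port, 1) % dev->num_ports + 1;
}
```

With two ports, successive calls would yield 1, 2, 1, 2, and so on, so consecutive DCT QPs land on alternating ports.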
Mark Zhang
8f3243a047
IB/mlx5: Add tx_affinity support for DCI QP
...
DCI QP supports tx_affinity as well.
Link: https://lore.kernel.org/r/20200818115245.700581-2-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 08:34:28 -03:00
Gustavo A. R. Silva
df561f6688
treewide: Use fallthrough pseudo-keyword
...
Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
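For readers unfamiliar with the conversion in "treewide: Use fallthrough pseudo-keyword", the following is a userspace sketch of the pattern. The macro here is a stand-in modelled on the kernel's definition (an attribute where the compiler supports it, an empty statement otherwise); `cleanup_steps()` is a hypothetical function invented for illustration.

```c
#include <assert.h>

/* Userspace stand-in for the kernel macro; GCC 7+ and Clang accept the attribute. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count how many cleanup steps run from a given stage. */
static int cleanup_steps(int stage)
{
	int steps = 0;

	assert(stage >= 0);
	switch (stage) {
	case 2:
		steps++;
		fallthrough;	/* was: a "fall through" comment */
	case 1:
		steps++;
		fallthrough;
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the attribute form is checked by the compiler, so a build with -Wimplicit-fallthrough can flag any case that falls through without the marker.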
Maor Gottlieb
e6ac9f6006
RDMA/mlx5: Enable sniffer when device is in switchdev mode
...
In order to allow sniffer when the RDMA device is in switchdev mode, we
don't need to set the source port when creating the sniffer rule.
Link: https://lore.kernel.org/r/20200803060214.15328-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 15:03:32 -03:00
Mark Zhang
c531024bb1
RDMA/mlx5: Add new IB rates support
...
Support 56, 25, 100, 200 and 50Gbps IB rates in mlx5 driver.
Link: https://lore.kernel.org/r/20200802081712.1993490-1-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 15:03:32 -03:00
Leon Romanovsky
d6673746d6
RDMA: Remove constant domain argument from flow creation call
...
The "domain" argument is constant and modern device (mlx5) doesn't support
anything except IB_FLOW_DOMAIN_USER, so delete this extra parameter and
simplify code.
Link: https://lore.kernel.org/r/20200730081235.1581127-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 14:47:34 -03:00
Leon Romanovsky
70c1430fba
RDMA/mlx5: Replace open-coded offsetofend() macro
...
Clean mlx5_ib from open-coded implementations of offsetofend().
Link: https://lore.kernel.org/r/20200730081235.1581127-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 14:47:34 -03:00
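As background for "RDMA/mlx5: Replace open-coded offsetofend() macro", here is a userspace sketch of what the macro computes: the offset of a member plus its size, i.e. the first byte past that member. The two macros mirror the kernel's definitions; `struct demo_req` and `req_has_extra()` are hypothetical and exist only to show a typical "is the user buffer long enough to contain this field" check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define sizeof_field(TYPE, MEMBER) sizeof(((TYPE *)0)->MEMBER)
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))

/* Hypothetical extensible request layout. */
struct demo_req {
	uint32_t flags;
	uint32_t comp_mask;
	uint64_t extra;		/* imagined as added in a later ABI revision */
};

/* True when the caller's buffer is long enough to contain 'extra'. */
static int req_has_extra(size_t user_len)
{
	/* Open-coded form this replaces:
	 * offsetof(struct demo_req, extra) + sizeof(((struct demo_req *)0)->extra)
	 */
	return user_len >= offsetofend(struct demo_req, extra);
}

static_assert(offsetofend(struct demo_req, comp_mask) == 8,
	      "two 32-bit fields end at byte 8");
```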
Leon Romanovsky
156f378985
RDMA/mlx5: Simplify multiple else-if cases with switch keyword
...
Improve readability of fs.c by converting multiple else-if constructions
to be implemented with switch keyword.
Link: https://lore.kernel.org/r/20200730081235.1581127-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 14:47:34 -03:00
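The readability change in "RDMA/mlx5: Simplify multiple else-if cases with switch keyword" follows a common pattern, sketched below. The enum and the priority values are hypothetical and are not taken from fs.c; only the shape of the conversion is the point.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flow types, not the driver's real enum. */
enum demo_flow_type {
	DEMO_FLOW_NORMAL,
	DEMO_FLOW_SNIFFER,
	DEMO_FLOW_DEFAULT_MC,
};

/*
 * Before: if (type == A) ... else if (type == B) ... else if (type == C) ...
 * After: one switch with a single exit for unknown values.
 */
static int demo_flow_prio(int type)
{
	switch (type) {
	case DEMO_FLOW_NORMAL:
		return 0;
	case DEMO_FLOW_SNIFFER:
		return 1;
	case DEMO_FLOW_DEFAULT_MC:
		return 2;
	default:
		return -EINVAL;
	}
}
```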
Linus Torvalds
d7806bbd22
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
...
Pull rdma updates from Jason Gunthorpe:
"A quiet cycle after the larger 5.8 effort. Substantial cleanup and
driver work with a few smaller features this time.
- Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re
- Removal of dead or redundant code across the drivers
- RAW resource tracker dumps to include a device specific data blob
for device objects to aid device debugging
- Further advance the IOCTL interface, remove the ability to turn it
off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands
- Remove stubs related to devices with no pkey table
- A shared CQ scheme to allow multiple ULPs to share the CQ rings of
a device to give higher performance
- Several more static checker, syzkaller and rare crashers fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits)
RDMA/mlx5: Fix flow destination setting for RDMA TX flow table
RDMA/rxe: Remove pkey table
RDMA/umem: Add a schedule point in ib_umem_get()
RDMA/hns: Fix the unneeded process when getting a general type of CQE error
RDMA/hns: Fix error during modify qp RTS2RTS
RDMA/hns: Delete unnecessary memset when allocating VF resource
RDMA/hns: Remove redundant parameters in set_rc_wqe()
RDMA/hns: Remove support for HIP08_A
RDMA/hns: Refactor hns_roce_v2_set_hem()
RDMA/hns: Remove redundant hardware opcode definitions
RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP
RDMA/include: Replace license text with SPDX tags
RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
RDMA/cma: Execute rdma_cm destruction from a handler properly
RDMA/cma: Remove unneeded locking for req paths
RDMA/cma: Using the standard locking pattern when delivering the removal event
RDMA/cma: Simplify DEVICE_REMOVAL for internal_id
RDMA/efa: Add EFA 0xefa1 PCI ID
RDMA/efa: User/kernel compatibility handshake mechanism
...
2020-08-06 16:43:36 -07:00
Michael Guralnik
23fcc7dee2
RDMA/mlx5: Fix flow destination setting for RDMA TX flow table
...
For RDMA TX flow table, set destination type to be 'port' and prevent
creation of flows with TIR destination.
As RDMA TX is an egress flow table the rules on this flow table should
not forward traffic back to the NIC and should set the destination to be
the port.
Without the setting of this destination type flow rules on the RDMA TX
flow tables are not created as FW invokes a syndrome for undefined
destination for the rule.
Fixes: 24670b1a31 ("net/mlx5: Add support for RDMA TX steering")
Link: https://lore.kernel.org/r/20200803055849.14947-1-leon@kernel.org
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-05 21:09:39 -03:00
Linus Torvalds
99ea1521a0
Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
...
Pull uninitialized_var() macro removal from Kees Cook:
"This is long overdue, and has hidden too many bugs over the years. The
series has several "by hand" fixes, and then a trivial treewide
replacement.
- Clean up non-trivial uses of uninitialized_var()
- Update documentation and checkpatch for uninitialized_var() removal
- Treewide removal of uninitialized_var()"
* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
compiler: Remove uninitialized_var() macro
treewide: Remove uninitialized_var() usage
checkpatch: Remove awareness of uninitialized_var() macro
mm/debug_vm_pgtable: Remove uninitialized_var() usage
f2fs: Eliminate usage of uninitialized_var() macro
media: sur40: Remove uninitialized_var() usage
KVM: PPC: Book3S PR: Remove uninitialized_var() usage
clk: spear: Remove uninitialized_var() usage
clk: st: Remove uninitialized_var() usage
spi: davinci: Remove uninitialized_var() usage
ide: Remove uninitialized_var() usage
rtlwifi: rtl8192cu: Remove uninitialized_var() usage
b43: Remove uninitialized_var() usage
drbd: Remove uninitialized_var() usage
x86/mm/numa: Remove uninitialized_var() usage
docs: deprecated.rst: Add uninitialized_var()
2020-08-04 13:49:43 -07:00
Leon Romanovsky
7fa84b5708
RDMA/mlx5: Initialize QP mutex for the debug kernels
...
In DCT and RSS RAW QP creation flows, the QP mutex wasn't initialized and
the magic field inside the lock was missing. This caused the following
kernel warning on kernels built with CONFIG_DEBUG_MUTEXES:
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 3 PID: 16261 at kernel/locking/mutex.c:938 __mutex_lock+0x60e/0x940
Modules linked in: bonding nf_tables ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip6_gre ip6_tunnel tunnel6 ip_gre gre ip_tunnel mlx5_ib mlx5_core mlxfw ptp pps_core rdma_ucm ib_uverbs ib_ipoib ib_umad openvswitch nsh xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core [last unloaded: mlxfw]
CPU: 3 PID: 16261 Comm: ib_send_bw Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:__mutex_lock+0x60e/0x940
Code: c0 0f 84 6d fa ff ff 44 8b 15 4e 9d ba 00 45 85 d2 0f 85 5d fa ff ff 48 c7 c6 f2 de 2b 82 48 c7 c7 f1 8a 2b 82 e8 d2 4d 72 ff <0f> 0b 4c 8b 4d 88 e9 3f fa ff ff f6 c2 04 0f 84 37 fe ff ff 48 89
RSP: 0018:ffff88810bb8b870 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88829f1dd880 RSI: 0000000000000000 RDI: ffffffff81192afa
RBP: ffff88810bb8b910 R08: 0000000000000000 R09: 0000000000000028
R10: 0000000000000000 R11: 0000000000003f85 R12: 0000000000000002
R13: ffff88827d8d3ce0 R14: ffffffffa059f615 R15: ffff8882a4d02610
FS: 00007f3f6988e740(0000) GS:ffff8882f5b80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556556158000 CR3: 000000010a63c005 CR4: 0000000000360ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? cmd_exec+0x947/0xe60 [mlx5_core]
? __mutex_lock+0x76/0x940
? mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib]
mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib]
mlx5_ib_counter_bind_qp+0x9b/0xe0 [mlx5_ib]
__rdma_counter_bind_qp+0x6b/0xa0 [ib_core]
rdma_counter_bind_qp_auto+0x363/0x520 [ib_core]
_ib_modify_qp+0x316/0x580 [ib_core]
ib_modify_qp_with_udata+0x19/0x30 [ib_core]
modify_qp+0x4c4/0x600 [ib_uverbs]
ib_uverbs_ex_modify_qp+0x87/0xe0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x129/0x1c0 [ib_uverbs]
ib_uverbs_cmd_verbs.isra.5+0x5d5/0x11f0 [ib_uverbs]
? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x120/0x120 [ib_uverbs]
? lock_acquire+0xb9/0x3a0
? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs]
? ib_uverbs_ioctl+0x175/0x210 [ib_uverbs]
ib_uverbs_ioctl+0x14b/0x210 [ib_uverbs]
? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs]
ksys_ioctl+0x234/0x7d0
? exc_page_fault+0x202/0x640
? do_syscall_64+0x1f/0x2e0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x59/0x2e0
? asm_exc_page_fault+0x8/0x30
? rcu_read_lock_sched_held+0x52/0x60
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: b4aaa1f0b4 ("IB/mlx5: Handle type IB_QPT_DRIVER when creating a QP")
Link: https://lore.kernel.org/r/20200730082719.1582397-2-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-30 11:03:33 -03:00
Leon Romanovsky
81530ab08e
RDMA/mlx5: Allow providing extra scatter CQE QP flag
...
Scatter CQE feature relies on two flags MLX5_QP_FLAG_SCATTER_CQE and
MLX5_QP_FLAG_ALLOW_SCATTER_CQE, both of them can be provided without
relation to device capability.
Relax global validity check to allow MLX5_QP_FLAG_ALLOW_SCATTER_CQE QP
flag.
Existing user applications are failing on this new validity check.
Fixes: 90ecb37a75 ("RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags")
Fixes: 37518fa49f ("RDMA/mlx5: Process all vendor flags in one place")
Link: https://lore.kernel.org/r/20200728120255.805733-1-leon@kernel.org
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 14:19:01 -03:00
Leon Romanovsky
71cab8ef5c
RDMA/mlx5: Delete unreachable code
...
Delete two occurrences of unreachable code discovered by Coverity.
Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-28 16:25:37 -03:00
Jason Gunthorpe
5351a56b1a
RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr fails
...
destroy_prefetch_work() must always be called if the work is not going
to be queued. The num_sge also should have been set to i, not i-1
which avoids the condition where it shouldn't have been called in the
first place.
Cc: stable@vger.kernel.org
Fixes: fb985e278a ("RDMA/mlx5: Use SRCU properly in ODP prefetch")
Link: https://lore.kernel.org/r/20200727095712.495652-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27 11:50:20 -03:00
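The unwind rule from "RDMA/mlx5: Fix prefetch memory leak if get_prefetchable_mr fails" can be illustrated with a userspace sketch. It is not the driver's code: every `demo_` name is hypothetical, and `malloc()` stands in for taking a reference on an MR. The point it shows is that when entry i fails, entries [0, i) are valid, so the count must be set to i and the destroy helper must always run.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_SGE 8

struct demo_work {
	size_t num_sge;			/* number of valid entries in frags[] */
	int *frags[DEMO_MAX_SGE];
};

static int demo_live;	/* outstanding allocations, tracked for the demo only */

/* Stand-in for acquiring one resource; fails on request. */
static int *demo_get(int fail)
{
	int *p = fail ? NULL : malloc(sizeof(*p));

	if (p)
		demo_live++;
	return p;
}

/* Releases exactly the first num_sge entries, then the work item itself. */
static void demo_destroy_work(struct demo_work *w)
{
	for (size_t i = 0; i < w->num_sge; i++) {
		free(w->frags[i]);
		demo_live--;
	}
	free(w);
}

static struct demo_work *demo_init_work(size_t n, size_t fail_at)
{
	struct demo_work *w;

	assert(n <= DEMO_MAX_SGE);
	w = calloc(1, sizeof(*w));
	if (!w)
		return NULL;
	for (size_t i = 0; i < n; i++) {
		w->frags[i] = demo_get(i == fail_at);
		if (!w->frags[i]) {
			w->num_sge = i;		/* i, not i - 1 */
			demo_destroy_work(w);	/* always called when not queued */
			return NULL;
		}
	}
	w->num_sge = n;
	return w;
}
```

Setting the count to i also covers the i == 0 case naturally: the destroy helper then releases nothing but the work item, so there is no need to skip the call.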
Jason Gunthorpe
7923774368
Merge branch 'mlx5_uar' into rdma.git for-next
...
Meir Lichtinger says:
====================
ConnectX-7 supports setting relaxed ordering read/write mkey attribute by
UMR, indicated by new HCA capabilities, so extend mlx5_ib driver to
configure UMR control segment
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_uar':
RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
RDMA/mlx5: Use MLX5_SET macro instead of local structure
RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR
2020-07-27 11:44:36 -03:00
Meir Lichtinger
896ec97353
RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7
...
Before ConnectX-7, UMR is not used when the user passes the relaxed
ordering access flag. ConnectX-7 supports setting the relaxed ordering
read/write mkey attribute by UMR, indicated by new HCA capabilities.
With ConnectX-7, the driver uses UMR when the user sets the relaxed
ordering access flag, in contrast to previous silicon models. Specifically,
this includes setting the relevant flags of the mkey context mask in the
UMR control segment, and the relaxed ordering write and read flags in the
UMR mkey context segment.
Link: https://lore.kernel.org/r/20200716105248.1423452-4-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27 11:19:00 -03:00
Meir Lichtinger
2224635938
RDMA/mlx5: Use MLX5_SET macro instead of local structure
...
Use generic mlx5 structure defined in mlx5_ifc.h to represent ConnectX
device data structures instead of using structure defined specifically for
mlx5_ib module.
Link: https://lore.kernel.org/r/20200716105248.1423452-3-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-27 11:19:00 -03:00
Maor Gottlieb
d4d7f59643
RDMA/mlx5: Add missing srcu_read_lock in ODP implicit flow
...
According to the locking scheme, mlx5_ib_update_xlt() should be called
with srcu_read_lock(dev->odp->srcu). Prefetch missed this. This fixes the
below WARN from lockdep_assert_held():
WARNING: CPU: 1 PID: 1130 at drivers/infiniband/hw/mlx5/odp.c:132 mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core mlx5_core mlxfw ptp pps_core
CPU: 1 PID: 1130 Comm: kworker/u16:11 Tainted: G W 5.8.0-rc5_for_upstream_debug_2020_07_13_11_04 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Workqueue: events_unbound mlx5_ib_prefetch_mr_work [mlx5_ib]
RIP: 0010:mlx5_odp_populate_xlt+0x175/0x180 [mlx5_ib]
Code: 08 e2 85 c0 0f 84 65 ff ff ff 49 8b 87 60 01 00 00 be ff ff ff ff 48 8d b8 b0 39 00 00 e8 93 e0 50 e1 85 c0 0f 85 45 ff ff ff <0f> 0b e9 3e ff ff ff 0f 0b eb c7 0f 1f 44 00 00 48 8b 87 98 0f 00
RSP: 0018:ffff88840f44fc68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88840cc9d000 RCX: ffff88840efcd940
RDX: 0000000000000000 RSI: ffff88844871b9b0 RDI: ffff88840efce100
RBP: ffff88840cc9d040 R08: 0000000000000040 R09: 0000000000000001
R10: ffff88846ced3068 R11: 0000000000000000 R12: 00000000000156ec
R13: 0000000000000004 R14: 0000000000000004 R15: ffff888439941000
FS: 0000000000000000(0000) GS:ffff88846fa80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8536d12430 CR3: 0000000437a5e006 CR4: 0000000000360ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
mlx5_ib_update_xlt+0x37c/0x7c0 [mlx5_ib]
pagefault_mr+0x315/0x440 [mlx5_ib]
mlx5_ib_prefetch_mr_work+0x56/0xa0 [mlx5_ib]
process_one_work+0x215/0x5c0
worker_thread+0x3c/0x380
? process_one_work+0x5c0/0x5c0
kthread+0x133/0x150
? kthread_park+0x90/0x90
ret_from_fork+0x1f/0x30
Hold the SRCU during prefetch. Strictly speaking it isn't needed, since
prefetch is holding the num_deferred_work, but it makes the code easier to
reason about.
Fixes: 5256edcb98 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065747.131157-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24 16:44:06 -03:00
Maor Gottlieb
c94e272b57
RDMA/mlx5: Allow SQ modification
...
Currently the SQ is set to a ready state when the RAW QP is modified to
INIT. When the TIS is modified, e.g. to change the lag_tx_affinity, then
SQs which are already in the ready state will not be affected.
Open a window to modify the SQ behavior by setting the SQ as ready only
when QP was modified to RTS.
Link: https://lore.kernel.org/r/20200716105416.1423826-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24 15:49:19 -03:00
Jason Gunthorpe
a862192e92
RDMA/mlx5: Prevent prefetch from racing with implicit destruction
...
Prefetch work in mlx5_ib_prefetch_mr_work can be queued and able to run
concurrently with destruction of the implicit MR. The num_deferred_work
was intended to serialize this, but there is a race:
CPU0                                    CPU1
mlx5_ib_free_implicit_mr()
  xa_erase(odp_mkeys)
  synchronize_srcu()
  __xa_erase(implicit_children)
                                        mlx5_ib_prefetch_mr_work()
                                          pagefault_mr()
                                            pagefault_implicit_mr()
                                              implicit_get_child_mr()
                                                xa_cmpxchg()
                                          atomic_dec_and_test(num_deferred_mr)
  wait_event(imr->q_deferred_work)
  ib_umem_odp_release(odp_imr)
    kfree(odp_imr)
At this point in mlx5_ib_free_implicit_mr() the implicit_children list is
supposed to be empty forever so that destroy_unused_implicit_child_mr()
and related are not and will not be running.
Since it is not empty the destroy_unused_implicit_child_mr() flow ends up
touching deallocated memory as mlx5_ib_free_implicit_mr() already tore down the
imr parent.
The solution is to flush out the prefetch wq by driving num_deferred_work
to zero after creation of new prefetch work is blocked.
Fixes: 5256edcb98 ("RDMA/mlx5: Rework implicit ODP destroy")
Link: https://lore.kernel.org/r/20200719065435.130722-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-21 13:51:35 -03:00
Kees Cook
3f649ab728
treewide: Remove uninitialized_var() usage
...
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.
In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:
git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'
drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.
No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.
[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/
Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
2020-07-16 12:35:15 -07:00
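To make the effect of "treewide: Remove uninitialized_var() usage" concrete, here is a userspace sketch of the "simply initialize the variable" alternative the commit message recommends. For GCC builds the removed macro expanded to a self-assignment (`x = x`), which silenced the warning without giving the variable a value. `first_negative()` is a hypothetical function used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * With the old macro, "int uninitialized_var(idx);" became "int idx = idx;".
 * If no loop iteration assigned idx, the function returned garbage and the
 * compiler stayed silent. An explicit initializer makes the "not found"
 * result well defined instead.
 */
static int first_negative(const int *v, size_t n)
{
	int idx = -1;	/* previously: int uninitialized_var(idx); */
	size_t i;

	assert(v || n == 0);
	for (i = 0; i < n; i++) {
		if (v[i] < 0) {
			idx = (int)i;
			break;
		}
	}
	return idx;
}
```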
Daria Velikovsky
0829d2da60
RDMA/mlx5: Init dest_type when create flow
...
When using action drop, dest_type was never assigned a value. Add
initialization of dest_type to -1 since 0 is valid.
Fixes: f29de9eee7 ("RDMA/mlx5: Add support for drop action in DV steering")
Link: https://lore.kernel.org/r/20200707110259.882276-1-leon@kernel.org
Signed-off-by: Daria Velikovsky <daria@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 14:11:53 -03:00
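The reasoning in "RDMA/mlx5: Init dest_type when create flow" (use -1 because 0 is a valid destination type) is the usual sentinel pattern. The sketch below is a userspace illustration; the enum values, `DEMO_DEST_NONE` and `demo_build_rule()` are hypothetical and do not reflect the driver's real definitions.

```c
#include <assert.h>

/* Hypothetical destination types; note that 0 names a real destination. */
enum demo_dest {
	DEMO_DEST_PORT = 0,
	DEMO_DEST_TIR = 1,
};
#define DEMO_DEST_NONE (-1)	/* sentinel outside the valid range */

struct demo_rule {
	int has_dest;
	int dest_type;
};

static struct demo_rule demo_build_rule(int is_drop, int requested)
{
	int dest_type = DEMO_DEST_NONE;	/* a drop rule never overwrites this */
	struct demo_rule r;

	if (!is_drop) {
		assert(requested >= 0);
		dest_type = requested;
	}
	r.has_dest = dest_type >= 0;
	r.dest_type = dest_type;
	return r;
}
```

Had the variable started at 0 (or been left unset), a drop rule would be indistinguishable from a rule that forwards to destination type 0.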
Maor Gottlieb
c3d6057e07
RDMA/mlx5: Use xa_lock_irq when access to SRQ table
...
SRQ table is accessed both from interrupt and process context,
therefore we must use xa_lock_irq.
inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
kworker/u17:9/8573 takes:
ffff8883e3503d30 (&xa->xa_lock#13){?...}-{2:2}, at: mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
{IN-HARDIRQ-W} state was registered at:
lock_acquire+0xb9/0x3a0
_raw_spin_lock+0x25/0x30
srq_event_notifier+0x2b/0xc0 [mlx5_ib]
notifier_call_chain+0x45/0x70
__atomic_notifier_call_chain+0x69/0x100
forward_event+0x36/0xc0 [mlx5_core]
notifier_call_chain+0x45/0x70
__atomic_notifier_call_chain+0x69/0x100
mlx5_eq_async_int+0xc5/0x160 [mlx5_core]
notifier_call_chain+0x45/0x70
__atomic_notifier_call_chain+0x69/0x100
mlx5_irq_int_handler+0x19/0x30 [mlx5_core]
__handle_irq_event_percpu+0x43/0x2a0
handle_irq_event_percpu+0x30/0x70
handle_irq_event+0x34/0x60
handle_edge_irq+0x7c/0x1b0
do_IRQ+0x60/0x110
ret_from_intr+0x0/0x2a
default_idle+0x34/0x160
do_idle+0x1ec/0x220
cpu_startup_entry+0x19/0x20
start_secondary+0x153/0x1a0
secondary_startup_64+0xa4/0xb0
irq event stamp: 20907
hardirqs last enabled at (20907): _raw_spin_unlock_irq+0x24/0x30
hardirqs last disabled at (20906): _raw_spin_lock_irq+0xf/0x40
softirqs last enabled at (20746): __do_softirq+0x2c9/0x436
softirqs last disabled at (20681): irq_exit+0xb3/0xc0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&xa->xa_lock#13);
<Interrupt>
lock(&xa->xa_lock#13);
*** DEADLOCK ***
2 locks held by kworker/u17:9/8573:
#0: ffff888295218d38 ((wq_completion)mlx5_ib_page_fault){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0
#1: ffff888401647e78 ((work_completion)(&pfault->work)){+.+.}-{0:0}, at: process_one_work+0x1f1/0x5f0
stack backtrace:
CPU: 0 PID: 8573 Comm: kworker/u17:9 Tainted: GO 5.7.0_for_upstream_min_debug_2020_06_14_11_31_46_41 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib]
Call Trace:
dump_stack+0x71/0x9b
mark_lock+0x4f2/0x590
? print_shortest_lock_dependencies+0x200/0x200
__lock_acquire+0xa00/0x1eb0
lock_acquire+0xb9/0x3a0
? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
_raw_spin_lock+0x25/0x30
? mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
mlx5_cmd_get_srq+0x18/0x70 [mlx5_ib]
mlx5_ib_eqe_pf_action+0x257/0xa30 [mlx5_ib]
? process_one_work+0x209/0x5f0
process_one_work+0x27b/0x5f0
? __schedule+0x280/0x7e0
worker_thread+0x2d/0x3c0
? process_one_work+0x5f0/0x5f0
kthread+0x111/0x130
? kthread_park+0x90/0x90
ret_from_fork+0x24/0x30
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20200712102641.15210-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16 09:28:16 -03:00
Gal Pressman
6c72a038bf
RDMA/mlx5: Remove unused to_mibmr function
...
The to_mibmr function is unused, remove it.
Link: https://lore.kernel.org/r/20200705141143.47303-1-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:40:39 -03:00
Leon Romanovsky
0a03715068
RDMA/mlx5: Set PD pointers for the error flow unwind
...
ib_pd is accessed internally during destruction of the TIR/TIS, but the PD
may not be set yet. This leads to the following kernel panic:
BUG: kernel NULL pointer dereference, address: 0000000000000074
PGD 8000000079eaa067 P4D 8000000079eaa067 PUD 7ae81067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 1 PID: 709 Comm: syz-executor.0 Not tainted 5.8.0-rc3 #41
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:destroy_raw_packet_qp_tis drivers/infiniband/hw/mlx5/qp.c:1189 [inline]
RIP: 0010:destroy_raw_packet_qp drivers/infiniband/hw/mlx5/qp.c:1527 [inline]
RIP: 0010:destroy_qp_common+0x2ca/0x4f0 drivers/infiniband/hw/mlx5/qp.c:2397
Code: 00 85 c0 74 2e e8 56 18 55 ff 48 8d b3 28 01 00 00 48 89 ef e8 d7 d3 ff ff 48 8b 43 08 8b b3 c0 01 00 00 48 8b bd a8 0a 00 00 <0f> b7 50 74 e8 0d 6a fe ff e8 28 18 55 ff 49 8d 55 50 4c 89 f1 48
RSP: 0018:ffffc900007bbac8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88807949e800 RCX: 0000000000000998
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88807c180140
RBP: ffff88807b50c000 R08: 000000000002d379 R09: ffffc900007bba00
R10: 0000000000000001 R11: 000000000002d358 R12: ffff888076f37000
R13: ffff88807949e9c8 R14: ffffc900007bbe08 R15: ffff888076f37000
FS: 00000000019bf940(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000074 CR3: 0000000076d68004 CR4: 0000000000360ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
mlx5_ib_create_qp+0xf36/0xf90 drivers/infiniband/hw/mlx5/qp.c:3014
_ib_create_qp drivers/infiniband/core/core_priv.h:333 [inline]
create_qp+0x57f/0xd20 drivers/infiniband/core/uverbs_cmd.c:1443
ib_uverbs_create_qp+0xcf/0x100 drivers/infiniband/core/uverbs_cmd.c:1564
ib_uverbs_write+0x5fa/0x780 drivers/infiniband/core/uverbs_main.c:664
__vfs_write+0x3f/0x90 fs/read_write.c:495
vfs_write+0xc7/0x1f0 fs/read_write.c:559
ksys_write+0x5e/0x110 fs/read_write.c:612
do_syscall_64+0x3e/0x70 arch/x86/entry/common.c:359
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x466479
Code: Bad RIP value.
RSP: 002b:00007ffd057b62b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000466479
RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003
RBP: 00000000019bf8fc R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000bf6 R14: 00000000004cb859 R15: 00000000006fefc0
Fixes: 6c41965d64 ("RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path")
Link: https://lore.kernel.org/r/20200707110612.882962-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08 20:15:59 -03:00
Aya Levin
530c8632b5
IB/mlx5: Fix 50G per lane indication
...
Some released FW versions mistakenly don't set the capability that 50G per
lane link-modes are supported for VFs (ptys_extended_ethernet capability
bit).
Use PTYS.ext_eth_proto_capability instead, as this indication is always
accurate. If PTYS.ext_eth_proto_capability is valid
(has a non-zero value) conclude that the HCA supports 50G per lane.
Otherwise, conclude that the HCA doesn't support 50G per lane.
Fixes: 08e8676f16 ("IB/mlx5: Add support for 50Gbps per lane link modes")
Link: https://lore.kernel.org/r/20200707110612.882962-3-leon@kernel.org
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08 20:15:58 -03:00
Leon Romanovsky
1e2b5a90de
RDMA/mlx5: Delete one-time used functions
...
Merge them into their callers, usually the only thing the caller did was
to call the one function, so this is clearer.
Link: https://lore.kernel.org/r/20200702081809.423482-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:08:03 -03:00
Leon Romanovsky
d8b7515e25
RDMA/mlx5: Cleanup DEVX initialization flow
...
Move DEVX initialization and cleanup flows to the devx.c instead of having
almost empty functions in main.c
Link: https://lore.kernel.org/r/20200702081809.423482-6-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:05:51 -03:00
Leon Romanovsky
f7c4ffda0c
RDMA/mlx5: Separate flow steering logic from main.c
...
Move flow steering logic to a separate file and rename flow.c to fs.c
because that name better describes the content.
Link: https://lore.kernel.org/r/20200702081809.423482-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:05:51 -03:00
Leon Romanovsky
64825827ae
RDMA/mlx5: Separate counters from main.c
...
There are a number of counter types supported in mlx5_ib: HW counters,
congestion counters, Q-counters and flow counters. Almost all supporting
code was placed in main.c, which made the code almost impossible to
maintain. Let's create a separate code namespace for the counters to ease
future generalization efforts.
Link: https://lore.kernel.org/r/20200702081809.423482-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:05:51 -03:00
Leon Romanovsky
b572ebe667
RDMA/mlx5: Separate restrack callbacks initialization from main.c
...
The restrack code has its own .c file, so move callbacks initialization to
that file to improve code locality.
Link: https://lore.kernel.org/r/20200702081809.423482-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:05:51 -03:00
Leon Romanovsky
ac47bf5ef1
RDMA/mlx5: Limit the scope of mlx5_ib_enable_driver function
...
mlx5_ib_enable_driver() is a local function and doesn't need to be
shared in mlx5_ib, so change its signature to include the static keyword.
Link: https://lore.kernel.org/r/20200702081809.423482-2-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 14:05:51 -03:00
Leon Romanovsky
28ad5f65c3
RDMA: Move XRCD to be under ib_core responsibility
...
Update the code to allocate and free ib_xrcd structure in the
ib_core instead of inside drivers.
Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 20:11:24 -03:00
Leon Romanovsky
3b023e1b68
RDMA/core: Create and destroy counters in the ib_core
...
Move allocation and destruction of counters under ib_core responsibility
Link: https://lore.kernel.org/r/20200630101855.368895-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 20:04:40 -03:00
Yishai Hadas
05f71ef979
RDMA/mlx5: Introduce UAPI to query PD attributes
...
Introduce UAPI to query PD attributes. It can be used to retrieve the
attributes of a created PD, given its PD handle and ownership of the
command FD for the ucontext.
Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 19:50:34 -03:00
Yishai Hadas
0fb556b2b5
RDMA/mlx5: Implement the query ucontext functionality
...
Implement the query ucontext functionality by returning the original
ucontext data as part of an extra mlx5 attribute that holds the driver
UAPI response.
Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 19:50:33 -03:00
Yishai Hadas
45ec21c971
RDMA/mlx5: Refactor mlx5_ib_alloc_ucontext() response
...
Refactor mlx5_ib_alloc_ucontext() to set its response fields in a
cleaner way.
The refactoring:
- Moves the relevant code to a self-contained function.
- Calculates the response length once and drops redundant code throughout.
- Reuses previously set ucontext fields when preparing the response.
The self-contained function will be used in the next patch as part of
implementing the query ucontext functionality.
Link: https://lore.kernel.org/r/20200630093916.332097-5-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 19:50:33 -03:00
Leon Romanovsky
f4375443b7
RDMA/mlx5: Get XRCD number directly for the internal use
...
mlx5_ib creates an XRC domain and uses it for creating an internal SRQ.
However, all that is needed is the XRCD number, not a full-blown ib_xrcd
object.
Update the code to get and store the number only.
Link: https://lore.kernel.org/r/20200706122716.647338-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 19:32:23 -03:00
Gal Pressman
42a3b15396
RDMA: Remove the udata parameter from alloc_mr callback
...
Allocating an MR flow can only be initiated by kernel users, and not from
userspace so a udata parameter is redundant.
Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com >
Reviewed-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-06 19:25:53 -03:00
Maor Gottlieb
d473f4dc2f
RDMA/mlx5: Introduce ODP prefetch counter
...
For debugging purposes, it is easier to tell whether prefetch works
correctly if it has its own counter. Introduce an ODP prefetch counter that
counts, per MR, the total number of prefetched pages.
In addition, remove a comment that is no longer relevant and was not in
the correct place anyway.
Link: https://lore.kernel.org/r/20200621104147.53795-1-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-03 09:16:25 -03:00
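The per-MR counter described in the commit above can be pictured with a minimal userspace sketch. It is not the mlx5_ib implementation: `struct fake_mr`, `fake_mr_count_prefetch()` and `fake_mr_prefetched()` are hypothetical names chosen for illustration, and C11 atomics stand in for whatever counter type the driver actually uses.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a memory region; not a kernel structure. */
struct fake_mr {
    atomic_ulong prefetched_pages; /* total pages prefetched for this MR */
};

/* Add the pages handled by one prefetch request to the MR's running total. */
static void fake_mr_count_prefetch(struct fake_mr *mr, unsigned long npages)
{
    atomic_fetch_add_explicit(&mr->prefetched_pages, npages,
                              memory_order_relaxed);
}

/* Read the current total, e.g. when dumping debug statistics. */
static unsigned long fake_mr_prefetched(struct fake_mr *mr)
{
    return atomic_load_explicit(&mr->prefetched_pages, memory_order_relaxed);
}
```

Relaxed ordering is assumed to be enough here because the value is only a statistic and does not order any other memory access.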
Leon Romanovsky
f81b4565c1
RDMA/mlx5: Fix legacy IPoIB QP initialization
...
Legacy IPoIB sets the IB_QP_CREATE_NETIF_QP create flag, and because mlx5
doesn't use this flag, process_create_flags() failed to create IPoIB
QPs.
Fixes: 2978975ce7 ("RDMA/mlx5: Process create QP flags in one place")
Link: https://lore.kernel.org/r/20200630122147.445847-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-07-02 11:17:10 -03:00
Maor Gottlieb
28b5fa687f
RDMA/mlx5: Add support to get MR resource in RAW format
...
Add support to get MR (mkey) resource dump in RAW format.
Link: https://lore.kernel.org/r/20200623113043.1228482-12-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-06-24 08:52:29 -03:00
Maor Gottlieb
1ccecc88af
RDMA/mlx5: Add support to get CQ resource in RAW format
...
Add support to get CQ resource dump in RAW format.
Link: https://lore.kernel.org/r/20200623113043.1228482-11-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-06-24 08:52:29 -03:00
Maor Gottlieb
1776dd234a
RDMA/mlx5: Add support to get QP resource in RAW format
...
Add a generic function to use the resource dump mechanism to get the
QP resource data.
Link: https://lore.kernel.org/r/20200623113043.1228482-10-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-06-24 08:52:29 -03:00
Maor Gottlieb
f443452900
RDMA: Add dedicated MR resource tracker function
...
To avoid double multiplexing of the resource when it is an MR, add
a dedicated callback function.
Link: https://lore.kernel.org/r/20200623113043.1228482-5-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com >
2020-06-23 11:46:27 -03:00
Leon Romanovsky
6eefa839c4
RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata
...
Don't deref udata if it is NULL
BUG: kernel NULL pointer dereference, address: 0000000000000030
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 SMP PTI
CPU: 2 PID: 1592 Comm: python3 Not tainted 5.7.0-rc6+ #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
RIP: 0010:create_qp+0x39e/0xae0 [mlx5_ib]
Code: c0 0d 00 00 bf 10 01 00 00 e8 be a9 e4 e0 48 85 c0 49 89 c2 0f 84 0c 07 00 00 41 8b 85 74 63 01 00 0f c8 a9 00 00 00 10 74 0a <41> 8b 46 30 0f c8 41 89 42 14 41 8b 52 18 41 0f b6 4a 1c 0f ca 89
RSP: 0018:ffffc9000067f8b0 EFLAGS: 00010206
RAX: 0000000010170000 RBX: ffff888441313000 RCX: 0000000000000000
RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88845b1d4400
RBP: ffffc9000067fa60 R08: 0000000000000200 R09: ffff88845b1d4200
R10: ffff88845b1d4200 R11: ffff888441313000 R12: ffffc9000067f950
R13: ffff88846ac00140 R14: 0000000000000000 R15: ffff88846c2bc000
FS: 00007faa1a3c0540(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000030 CR3: 0000000446dca003 CR4: 0000000000760ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
mlx5_ib_create_qp+0x897/0xfa0 [mlx5_ib]
ib_create_qp+0x9e/0x300 [ib_core]
create_qp+0x92d/0xb20 [ib_uverbs]
? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
? release_resource+0x30/0x30
ib_uverbs_create_qp+0xc4/0xe0 [ib_uverbs]
ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc8/0xf0 [ib_uverbs]
ib_uverbs_run_method+0x223/0x770 [ib_uverbs]
? track_pfn_remap+0xa7/0x100
? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs]
? remap_pfn_range+0x358/0x490
ib_uverbs_cmd_verbs.isra.6+0x19b/0x370 [ib_uverbs]
? rdma_umap_priv_init+0x82/0xe0 [ib_core]
? vm_mmap_pgoff+0xec/0x120
ib_uverbs_ioctl+0xc0/0x120 [ib_uverbs]
ksys_ioctl+0x92/0xb0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x48/0x130
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: e383085c24 ("RDMA/mlx5: Set ECE options during QP create")
Link: https://lore.kernel.org/r/20200621115959.60126-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com >
2020-06-22 14:40:53 -03:00
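The guard described in the commit above ("Don't deref udata if it is NULL") follows a common pattern, sketched here in plain userspace C. This is an illustration only, not the driver's code: `struct fake_ucmd` and `get_ece_options()` are hypothetical names, and returning 0 when there is no user data is an assumption made for the example. In the trace, R14 is zero and the faulting address (CR2) is 0x30, which is consistent with reading a field at offset 0x30 through a NULL pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a user-supplied command; not a kernel struct. */
struct fake_ucmd {
    uint32_t ece_options;
};

/*
 * Return the ECE options requested by the caller, or 0 when no user data
 * was supplied (ucmd == NULL), instead of dereferencing the pointer blindly.
 */
static uint32_t get_ece_options(const struct fake_ucmd *ucmd)
{
    if (!ucmd)
        return 0; /* no user data: nothing to read */
    return ucmd->ece_options;
}
```

Callers that never supply user data can then pass NULL safely, while callers that do supply it get the requested value back.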
Max Gurtovoy
9e0dc7b9e1
RDMA/mlx5: Fix integrity enabled QP creation
...
The create_flags checks were refactored, which broke the creation of
integrity-enabled QPs and, with it, the NVMe/RDMA and iSER ULPs when using
mlx5-driven devices.
Fixes: 2978975ce7 ("RDMA/mlx5: Process create QP flags in one place")
Link: https://lore.kernel.org/r/20200617130230.2846915-1-leon@kernel.org
Signed-off-by: Max Gurtovoy <maxg@mellanox.com >
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com >
2020-06-18 15:14:57 -03:00
Leon Romanovsky
2c0f5292d5
RDMA/mlx5: Remove ECE limitation from the RAW_PACKET QPs
...
As with any other QP type, rely on FW to decide whether ECE is supported
for RAW_PACKET QPs. This fixes an inability to create RAW_PACKET QPs with
the latest rdma-core, which has ECE support.
Fixes: e383085c24 ("RDMA/mlx5: Set ECE options during QP create")
Link: https://lore.kernel.org/r/20200618112507.3453496-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com >
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com >
2020-06-18 14:59:12 -03:00