Conflicts:
drivers/infiniband/hw/mlx5/main.c - Both add new code
include/rdma/ib_verbs.h - Both add new code
Signed-off-by: Doug Ledford <dledford@redhat.com>
The slid field in struct ib_wc is increased to 32 bits.
This enables core components to use larger LIDs if needed.
The user ABI is unchanged and returns 16-bit values when queried.
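A minimal sketch of the idea (illustrative structs, not the full
definitions; the in-tree change widens the field in struct ib_wc in
include/rdma/ib_verbs.h while the uverbs ABI struct keeps 16 bits):

	struct example_wc {		/* kernel-internal completion */
		u32 slid;		/* widened from u16 */
	};

	struct example_uverbs_wc {	/* user ABI: unchanged */
		__u16 slid;		/* userspace still sees 16 bits */
	};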
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The logic of retrieving the netdev speed from a net_device and
translating it to an IB speed is duplicated in the rxe, usnic and bnxt
drivers. Define a new helper function that consolidates it.
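A hedged sketch of the shared translation step (the mapping mirrors
what the per-driver copies did; SPEED_* constants come from
uapi/linux/ethtool.h and the IB enums from rdma/ib_verbs.h, and only a
few cases are shown):

	static void netdev_speed_to_ib(u32 netdev_speed, u8 *speed, u8 *width)
	{
		switch (netdev_speed) {
		case SPEED_1000:
			*width = IB_WIDTH_1X;
			*speed = IB_SPEED_SDR;
			break;
		case SPEED_10000:
			*width = IB_WIDTH_1X;
			*speed = IB_SPEED_FDR10;
			break;
		case SPEED_40000:
			*width = IB_WIDTH_4X;
			*speed = IB_SPEED_FDR10;
			break;
		case SPEED_100000:
			*width = IB_WIDTH_4X;
			*speed = IB_SPEED_EDR;
			break;
		default:	/* fall back to the slowest setting */
			*width = IB_WIDTH_1X;
			*speed = IB_SPEED_SDR;
			break;
		}
	}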
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Christian Benvenuti <benve@cisco.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It's better to use __func__ to print a function's name instead of
hard-coding the name in the print statement.
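For example (illustrative message text):

	pr_err("rxe_qp_from_init: unable to init qp\n");	/* before */
	pr_err("%s: unable to init qp\n", __func__);		/* after */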
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Use the DEVICE_ATTR_RO() macro and rename the show function accordingly.
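The pattern, with a hypothetical attribute named "foo": DEVICE_ATTR_RO()
requires the show routine to be named foo_show and generates dev_attr_foo
with 0444 permissions:

	static ssize_t foo_show(struct device *dev,
				struct device_attribute *attr, char *buf)
	{
		return sprintf(buf, "%d\n", 42);
	}
	static DEVICE_ATTR_RO(foo);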
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The current computation of qp->timeout_jiffies in rvt_modify_qp() can
overflow because the input to usecs_to_jiffies() is only 32 bits
(unsigned int). Overflow occurs when attr->timeout is equal to or
greater than 30. The consequence is unnecessarily excessive retries and
thus degraded system performance.
This patch fixes the problem by limiting the input to 5 bits and calling
usecs_to_jiffies() before applying the scaling factor.
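A minimal sketch of the fixed computation (the IB timeout field encodes
a delay of 4.096 us * 2^timeout; the helper name is illustrative):

	static inline unsigned long ib_timeout_to_jiffies(u8 timeout)
	{
		/* 5-bit limit: 1U << timeout can no longer overflow */
		timeout &= 0x1f;
		/* convert first, scale after, so usecs_to_jiffies()
		 * never sees more than its 32-bit argument can hold */
		return usecs_to_jiffies(1U << timeout) * 4096UL / 1000UL;
	}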
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Callers of the driver now mark GFP_NOIO allocations with the
memalloc_noio_*() helpers, which makes it redundant to pass gfp flags
down to the driver; the flags can only ever be GFP_KERNEL.
The patch removes the gfp flags argument and updates all driver paths.
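The caller-side pattern that replaces per-call gfp flags (the helpers
live in include/linux/sched/mm.h; the driver entry point here is
hypothetical):

	unsigned int noio_flag = memalloc_noio_save();
	/* all allocations below, even plain GFP_KERNEL ones inside the
	 * driver, are implicitly degraded to GFP_NOIO */
	ret = driver_create_qp(dev);
	memalloc_noio_restore(noio_flag);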
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This fixes an over-read condition detected by FORTIFY_SOURCE for this
line:
memcpy(SKB_TO_PKT(skb), &ack_pkt, sizeof(skb->cb));
The error was:
In file included from ./include/linux/bitmap.h:8:0,
from ./include/linux/cpumask.h:11,
from ./include/linux/mm_types_task.h:13,
from ./include/linux/mm_types.h:4,
from ./include/linux/kmemcheck.h:4,
from ./include/linux/skbuff.h:18,
from drivers/infiniband/sw/rxe/rxe_resp.c:34:
In function 'memcpy',
inlined from 'send_atomic_ack.constprop' at drivers/infiniband/sw/rxe/rxe_resp.c:998:2,
inlined from 'acknowledge' at drivers/infiniband/sw/rxe/rxe_resp.c:1026:3,
inlined from 'rxe_responder' at drivers/infiniband/sw/rxe/rxe_resp.c:1286:10:
./include/linux/string.h:309:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
__read_overflow2();
Daniel Micay noted that struct rxe_pkt_info is 32 bytes on 32-bit
architectures, but skb->cb is still 64 bytes, so the memcpy() over-reads
32 bytes. Fix this by copying only the packet info and zeroing the
unused bytes in skb->cb.
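A sketch of the fix: copy only the packet info itself and zero the
remainder of skb->cb so nothing past ack_pkt is read:

	memcpy(SKB_TO_PKT(skb), &ack_pkt, sizeof(ack_pkt));
	memset((unsigned char *)SKB_TO_PKT(skb) + sizeof(ack_pkt), 0,
	       sizeof(skb->cb) - sizeof(ack_pkt));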
Link: http://lkml.kernel.org/r/1497903987-21002-5-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Moni Shoua <monis@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ensure we can't come up with an array index that is bigger than the array
by applying the QPN mask before the divide in the free_qpn() function.
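Sketch of the corrected computation (names as in rdmavt; mask first,
then divide, so the map index stays in range):

	map = qpt->map + (qpn & RVT_QPN_MASK) / RVT_BITS_PER_PAGE;
	if (map->page)
		clear_bit(qpn & RVT_BITS_PER_PAGE_MASK, map->page);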
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The free_qpn() function from the hfi1/qib driver, which was the basis
for the rdmavt_free_qpn() function, was accidentally left in the code.
Remove it.
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
SGEs that are contiguous needlessly consume driver-dependent TX
resources.
The lkey validation logic is enhanced to compress the SGE that ends
up in the send wqe when consecutive addresses are detected.
The lkey validation API used to return 1 (success) or 0 (fail). The
return value is now an -errno, 0 (compressed), or 1 (uncompressed). An
additional argument is added to pass in the last SGE for the
compression.
Loopback callers always pass a NULL last_sge since the optimization is
of little benefit in that situation.
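A minimal sketch of the contiguity test (modeled on the rdmavt helper;
names abridged): a new SGE folds into the previous one when the lkey
matches and the addresses abut:

	static inline bool sge_adjacent(struct rvt_sge *last_sge,
					struct ib_sge *sge)
	{
		return last_sge && sge->lkey == last_sge->mr->lkey &&
		       (u64)(last_sge->vaddr + last_sge->length) == sge->addr;
	}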
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Pull rdma fixes from Doug Ledford:
"I had thought at the time of the last pull request that there wouldn't
be much more to go, but several things just kept trickling in over the
last week.
Instead of just the six patches to bnxt_re that I had anticipated,
there are another five IPoIB patches, two qedr patches, and a few
other miscellaneous patches.
The bnxt_re patches are more lines of diff than I like to submit this
late in the game. That's mostly because of the first two patches in
the series of six. I almost dropped them just because of the lines of
churn, but on a close review, a lot of the churn came from removing
duplicated code sections and consolidating them into callable
routines. I felt like this made the number of lines of change more
acceptable, and they address problems, so I left them. The remainder
of the patches are all small, well contained, and well understood.
These have passed 0day testing, but have not been submitted to
linux-next (though a local merge test against your current master ran
without any conflicts).
Summary:
- A fix for commit eea40b8f62 ("infiniband: call ipv6 route lookup via
the stub interface")
- Six patches against bnxt_re...the first two are considerably larger
than I would like, but as they address real issues I went ahead and
submitted them (it also helped that a good deal of the churn was
removing code repeated in multiple places and consolidating it to
one common function)
- Two fixes against qedr that just came in
- One fix against rxe that took a few revisions to get right plus
time to get the proper reviews
- Five late breaking IPoIB fixes
- One late cxgb4 fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
rdma/cxgb4: Fix memory leaks during module exit
IB/ipoib: Fix memory leak in create child syscall
IB/ipoib: Fix access to un-initialized napi struct
IB/ipoib: Delete napi in device uninit default
IB/ipoib: Limit call to free rdma_netdev for capable devices
IB/ipoib: Fix memory leaks for child interfaces priv
rxe: Fix a sleep-in-atomic bug in post_one_send
RDMA/qedr: Add 64KB PAGE_SIZE support to user-space queues
RDMA/qedr: Initialize byte_len in WC of READ and SEND commands
RDMA/bnxt_re: Remove FMR support
RDMA/bnxt_re: Fix RQE posting logic
RDMA/bnxt_re: Add HW workaround for avoiding stall for UD QPs
RDMA/bnxt_re: Dereg MR in FW before freeing the fast_reg_page_list
RDMA/bnxt_re: HW workarounds for handling specific conditions
RDMA/bnxt_re: Fixing the Control path command and response handling
IB/addr: Fix setting source address in addr6_resolve()
The driver may sleep under a spin lock, and the function call path is:
post_one_send (acquire the lock by spin_lock_irqsave)
init_send_wqe
copy_from_user --> may sleep
There is no flow that makes "qp->is_user" true, and copy_from_user()
may cause a bug when a non-user pointer is used. So the copy_from_user()
call and the "qp->is_user" check are removed.
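The bug pattern in miniature (copy_from_user() may fault and sleep,
which is illegal under a held spinlock; the fix deletes the copy since
qp->is_user is never true):

	spin_lock_irqsave(&qp->sq.sq_lock, flags);
	if (qp->is_user)	/* dead branch in practice */
		err = copy_from_user(payload, user_ptr, length); /* may sleep */
	spin_unlock_irqrestore(&qp->sq.sq_lock, flags);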
Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
On sparc, if we have an alloca() like situation, as is the case with
SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
memory. The result can be that the value is clobbered if a trap
or interrupt arrives at just the right instruction.
It only occurs if the function ends returning a value from that
alloca() area and that value can be placed into the return value
register using a single instruction.
For example, in lib/libcrc32c.c:crc32c() we end up with a return
sequence like:
return %i7+8
lduw [%o5+16], %o0 ! MEM[(u32 *)__shash_desc.1_10 + 16B],
%o5 holds the base of the on-stack area allocated for the shash
descriptor. But the return released the stack frame and the
register window.
So if an interrupt arrives between 'return' and 'lduw', the value
read at %o5+16 can be corrupted.
Add a data compiler barrier to work around this problem. This is
exactly what the gcc fix will end up doing as well, and it absolutely
should not change the code generated for other cpus (unless gcc
on them has the same bug :-)
With crucial insight from Eric Sandeen.
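The workaround as applied in lib/libcrc32c.c: copy the result out of
the on-stack shash area, then pin that memory with barrier_data() so
gcc cannot sink the load past the frame teardown:

	ret = *ctx;
	barrier_data(ctx);
	return ret;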
Cc: <stable@vger.kernel.org>
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Callers of rxe_mem_copy() provide a pointer to store the updated CRC
value. That pointer was supposed to be updated, but
commit cee2688e3c ("IB/rxe: Offload CRC calculation when possible")
mistakenly removed that assignment for the RXE_MEM_TYPE_DMA memory type.
The code worked because there are no actual callers with
RXE_MEM_TYPE_DMA that are interested in the returned value of crcp.
The one caller in read_reply() that uses the returned crcp doesn't
set RXE_MEM_TYPE_DMA as mem->type.
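A sketch of the restored behavior in the RXE_MEM_TYPE_DMA branch of
rxe_mem_copy() (copy setup elided; exact ordering per rxe_mr.c):

	memcpy(dest, src, length);
	if (crcp)
		*crcp = rxe_crc32(rxe, *crcp, dest, length);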
Fixes: cee2688e3c ("IB/rxe: Offload CRC calculation when possible")
Reported-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When reading an RDMA WRITE FIRST packet we copy the DMA length from the
RDMA header into the qp->resp.resid variable for later use. Later, in
check_rkey(), we clamp it to the MTU if the packet is an RDMA WRITE
packet and has a residual length bigger than the MTU. Then, in
write_data_in(), we subtract the payload of the packet from the residual
length. If the packet happens to have a payload of exactly the MTU size,
we end up with a residual length of 0 even though the packet is not the
last one in the conversation. When the next packet in the conversation
arrives, we don't have any residual length left and thus put the QP into
an error state.
This broke NVMe over Fabrics functionality over rdma_rxe.ko.
The patch was verified using the following test.
# echo eth0 > /sys/module/rdma_rxe/parameters/add
# nvme connect -t rdma -a 192.168.155.101 -s 1023 -n nvmf-test
# mkfs.xfs -fK /dev/nvme0n1
meta-data=/dev/nvme0n1           isize=256    agcount=4, agsize=65536 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0, sparse=0
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# mount /dev/nvme0n1 /tmp/
[ 148.923263] XFS (nvme0n1): Mounting V4 Filesystem
[ 148.961196] XFS (nvme0n1): Ending clean mount
# dd if=/dev/urandom of=test.bin bs=1M count=128
128+0 records in
128+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.437991 s, 306 MB/s
# sha256sum test.bin
cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a test.bin
# cp test.bin /tmp/
# sha256sum /tmp/test.bin
cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a /tmp/test.bin
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
rdma_ah_attr can now be either IB or RoCE, allowing core components to
use one type or the other and also to define attributes unique to a
specific type. struct ib_ah is also initialized with the type when it's
first created. This ensures that calls such as modify_ah don't modify
the type of the address handle attribute.
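The shape of the typed attribute (per include/rdma/ib_verbs.h after
this change; some enumerators and fields elided):

	enum rdma_ah_attr_type {
		RDMA_AH_ATTR_TYPE_IB,
		RDMA_AH_ATTR_TYPE_ROCE,
	};

	struct rdma_ah_attr {
		struct ib_global_route grh;
		u8 sl;
		u8 static_rate;
		u8 port_num;
		enum rdma_ah_attr_type type;
		union {
			struct ib_ah_attr ib;	/* IB-specific fields */
			struct roce_ah_attr roce; /* RoCE-specific fields */
		};
	};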
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The InfiniBand spec states that "A multicast address is defined by a
MGID and a MLID" (section 10.5).
The current code uses only the MGID to identify multicast groups.
Update the driver to be compliant with this definition.
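A sketch of the compliant lookup (names illustrative of the driver's
mcast code): a group now matches on both fields, not the MGID alone:

	if (!memcmp(mcast->mgid.raw, mgid->raw, sizeof(mcast->mgid)) &&
	    mcast->lid == lid)
		return mcast;	/* found the multicast group */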
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This logic seems to be duplicated in (at least) three separate files.
Move it to one place so the code can be re-used.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
For an RC QP there is no need to resolve the outgoing interface for
each packet, as it does not change during the QP's life cycle. Instead,
cache the interface on the socket and use that one.
This improves performance by 12% by sparing redundant calls to
rxe_find_route (see the sketch after the benchmark below).
ib_send_bw -d rxe0 -x 1 -n 9000 -e -s $((1024 * 1024 )) -l 100
----------------------------------------------------------------------------------------
|        |  bytes  | iterations | BW peak[MB/sec] | BW average[MB/sec] | MsgRate[Mpps] |
----------------------------------------------------------------------------------------
| before | 1048576 |    9000    |       inf       |       551.21       |   0.000551    |
| after  | 1048576 |    9000    |       inf       |       615.54       |   0.000616    |
----------------------------------------------------------------------------------------
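The caching scheme in miniature (modeled on rxe_net.c; the resolver
call and its argument list are illustrative of the old per-packet
lookup):

	if (qp_type(qp) == IB_QPT_RC)
		dst = sk_dst_get(qp->sk->sk);	/* reuse cached route */

	if (!dst || !dst_check(dst, qp->dst_cookie)) {
		dst = rxe_find_route(rxe, qp, av);
		if (dst && qp_type(qp) == IB_QPT_RC)
			sk_dst_set(qp->sk->sk, dst);	/* cache it */
	}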
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The rxe_rcv() function is used internally in RXE and doesn't need to be
exported. This patch removes the export declaration.
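The dropped export, in essence:

	EXPORT_SYMBOL(rxe_rcv);	/* removed: rxe_rcv has no external users */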
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Expose new counters using the get_hw_stats callback.
We expose the following counters:
+---------------------+----------------------------------------+
| Name                | Description                            |
|---------------------+----------------------------------------|
|sent_pkts            | number of sent packets                 |
|---------------------+----------------------------------------|
|rcvd_pkts            | number of received packets             |
|---------------------+----------------------------------------|
|out_of_sequence      | number of errors due to packet         |
|                     | transport sequence numbers             |
|---------------------+----------------------------------------|
|duplicate_request    | number of received duplicated packets; |
|                     | a request that was previously executed |
|                     | is considered duplicated               |
|---------------------+----------------------------------------|
|rcvd_rnr_err         | number of RNR NAKs received by the     |
|                     | completer                              |
|---------------------+----------------------------------------|
|send_rnr_err         | number of RNR NAKs sent by the         |
|                     | responder                              |
|---------------------+----------------------------------------|
|rcvd_seq_err         | number of out-of-sequence packets      |
|                     | received                               |
|---------------------+----------------------------------------|
|ack_deffered         | number of ack packets whose handling   |
|                     | was deferred                           |
|---------------------+----------------------------------------|
|retry_exceeded_err   | number of times the retry count was    |
|                     | exceeded                               |
|---------------------+----------------------------------------|
|completer_retry_err  | number of times the completer decided  |
|                     | to retry                               |
|---------------------+----------------------------------------|
|send_err             | number of failed packet sends          |
+---------------------+----------------------------------------+
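Sketch of wiring the counters into the get_hw_stats interface (the rxe
implementation lives in rxe_hw_counters.c; names abridged):

	static struct rdma_hw_stats *rxe_ib_alloc_hw_stats(struct ib_device *ibdev,
							   u8 port_num)
	{
		/* counter name strings from the table above, in order */
		return rdma_alloc_hw_stats_struct(rxe_counter_name,
						  ARRAY_SIZE(rxe_counter_name),
						  RDMA_HW_STATS_DEFAULT_LIFESPAN);
	}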
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Andrew Boyer <andrew.boyer@dell.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The synchronize_rcu() call can be eliminated to improve memory
deregistration performance.
There are two key fields involved:
- the RCU pointer itself
- the lkey_published field
To close the window between the RCU read of the mregion pointer and
taking the reference count, the code should (see the sketch after this
list):
1. In lkey/rkey validation (the reader):
Read the RCU pointer. If the pointer is non-NULL, get a reference.
For the current validation tests, use READ_ONCE() on lkey_published.
Upon any failure, release the reference.
2. In the remove logic (delete):
Ensure lkey_published is zeroed prior to setting the pointer to NULL.
This requires using rcu_assign_pointer() to ensure lkey_published
is written prior to the NULL.
3. In the insert logic (add):
Ensure lkey_published is set, and use rcu_assign_pointer() so the
pointer write is ordered after all other MR fields.
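A minimal sketch of the reader side (field and helper names modeled on
rdmavt's lkey table):

	rcu_read_lock();
	mr = rcu_dereference(rkt->table[idx]);
	if (!mr)
		goto bail;
	rvt_get_mr(mr);			/* reference taken under RCU */
	if (!READ_ONCE(mr->lkey_published))
		goto bail_unref;	/* raced with remove: drop the ref */
	rcu_read_unlock();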
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The wqe should be read-only, and in fact the superfluous reset of the
RVT_SEND_RESERVE_USED flag causes an issue where reserved operations
elicit a bad completion to the ULP.
The maintenance of the flag is now entirely within rvt_post_one_wr(),
where a reserved operation will set the flag and a non-reserved
operation will ensure the flag is reset on the operation about to be
posted.
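Sketch of the flag maintenance now confined to rvt_post_one_wr() (the
reserved-operation test is modeled on rdmavt's per-opcode flags table):

	if (rdi->post_parms[wr->opcode].flags & RVT_OPERATION_USE_RESERVE)
		wqe->wr.send_flags |= RVT_SEND_RESERVE_USED;
	else
		wqe->wr.send_flags &= ~RVT_SEND_RESERVE_USED;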
Fixes: 856cc4c237 ("IB/hfi1: Add the capability for reserved operations")
Reviewed-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>