Commit Graph

1417 Commits

Author SHA1 Message Date
Saeed Mahameed
1a412fb1ca {net,IB}/mlx5: Modify QP commands via mlx5 ifc
Prior to this patch we assumed that modify QP commands have the
same layout.

In ConnectX-4 for each QP transition there is a specific command
and their layout can vary.

e.g: 2err/2rst commands don't have QP context in their layout and before
this patch we posted the QP context in those commands.

Fortunately the FW only checks the suffix of the commands and executes
them, while ignoring all invalid data sent after the valid command
layout.

This patch removes mlx5_modify_qp_mbox_in and changes
mlx5_core_qp_modify to receive the required transition and QP context
with opt_param_mask if needed.  This way the caller is not required to
provide the command inbox layout and it will be generated automatically.

mlx5_core_qp_modify will generate the command inbox/outbox layouts
according to the requested transition and will fill the requested
parameters.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2016-08-17 17:45:58 +03:00
Saeed Mahameed
09a7d9eca1 {net,IB}/mlx5: QP/XRCD commands via mlx5 ifc
Remove old representation of manually created QP/XRCD commands layout
amd use mlx5_ifc canonical structures and defines.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2016-08-17 17:45:57 +03:00
Saeed Mahameed
ec22eb5310 {net,IB}/mlx5: MKey/PSV commands via mlx5 ifc
Remove old representation of manually created MKey/PSV commands layout,
and use mlx5_ifc canonical structures and defines.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2016-08-14 14:39:18 +03:00
Saeed Mahameed
2782778663 {net,IB}/mlx5: CQ commands via mlx5 ifc
Remove old representation of manually created CQ commands layout,
and use mlx5_ifc canonical structures and defines.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2016-08-14 14:39:15 +03:00
Linus Torvalds
0cda611386 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull base rdma updates from Doug Ledford:
 "Round one of 4.8 code: while this is mostly normal, there is a new
  driver in here (the driver was hosted outside the kernel for several
  years and is actually a fairly mature and well coded driver).  It
  amounts to 13,000 of the 16,000 lines of added code in here.

  Summary:

   - Updates/fixes for iw_cxgb4 driver
   - Updates/fixes for mlx5 driver
   - Add flow steering and RSS API
   - Add hardware stats to mlx4 and mlx5 drivers
   - Add firmware version API for RDMA driver use
   - Add the rxe driver (this is a software RoCE driver that makes any
     Ethernet device a RoCE device)
   - Fixes for i40iw driver
   - Support for send only multicast joins in the cma layer
   - Other minor fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits)
  Soft RoCE driver
  IB/core: Support for CMA multicast join flags
  IB/sa: Add cached attribute containing SM information to SA port
  IB/uverbs: Fix race between uverbs_close and remove_one
  IB/mthca: Clean up error unwind flow in mthca_reset()
  IB/mthca: NULL arg to pci_dev_put is OK
  IB/hfi1: NULL arg to sc_return_credits is OK
  IB/mlx4: Add diagnostic hardware counters
  net/mlx4: Query performance and diagnostics counters
  net/mlx4: Add diagnostic counters capability bit
  Use smaller 512 byte messages for portmapper messages
  IB/ipoib: Report SG feature regardless of HW UD CSUM capability
  IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct
  IB/hfi1: Disable by default
  IB/rdmavt: Disable by default
  IB/mlx5: Fix port counter ID association to QP offset
  IB/mlx5: Fix iteration overrun in GSI qps
  i40iw: Add NULL check for puda buffer
  i40iw: Change dup_ack_thresh to u8
  i40iw: Remove unnecessary check for moving CQ head
  ...
2016-08-04 20:10:31 -04:00
Doug Ledford
3e5e8e8a9a Merge branches 'cxgb4' and 'mlx5' into k.o/for-4.8 2016-08-03 20:58:45 -04:00
Alex Vesker
321a9e3ebc IB/mlx5: Fix port counter ID association to QP offset
The q-counter-id is given in modify-QP command associates
the QP with the counter. The offset to which the counter
ID was set is incorrect, causing IB port counters not to
count on QP.

Fixes: 0837e86a7a ('IB/mlx5: Add per port counters')
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02 14:40:49 -04:00
Slava Shwartsman
b0ffeb537f IB/mlx5: Fix iteration overrun in GSI qps
Number of outstanding_pi may overflow and as a result may indicate that
there are no elements in the queue. The effect of doing this is that the
MAD layer will get stuck waiting for completions. The MAD layer will
think that the QP is full - because it didn't receive these completions.

This fix changes it so the outstanding_pi number is increased
with 32-bit wraparound and is not limited to max_send_wr so
that the difference between outstanding_pi and outstanding_ci will
really indicate the number of outstanding completions.

Cc: Stable <stable@vger.kernel.org>
Fixes: ea6dc20362 ('IB/mlx5: Reorder GSI completions')
Signed-off-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02 14:32:51 -04:00
Wei Yongjun
619615005e IB/mlx5: Fix duplicate const warning
Fixes the following sparse warning:

drivers/infiniband/hw/mlx5/main.c:2574:25: warning: duplicate const

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-02 13:00:33 -04:00
Maor Gottlieb
c5bb17302e net/mlx5: Refactor mlx5_add_flow_rule
Reduce the set of arguments passed to mlx5_add_flow_rule
by introducing flow_spec structure.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-05 00:06:02 -07:00
Doug Ledford
fb92d8fb1b Merge branches 'cxgb4-4.8', 'mlx5-4.8' and 'fw-version' into k.o/for-4.8 2016-06-23 12:29:26 -04:00
Ira Weiny
c73428230d IB/mlx5: Support device FW version string
And remove sysfs entry in favor of the common code.

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 12:08:33 -04:00
Mark Bloch
0ad17a8f7f IB/mlx5: Add port protocol stats
Expose new counters using the get protocol stats callback.
We expose the following counters:

|------------------------------------------------------------------------|
|      Name           | IB | EN |           Description                  |
|------------------------------------------------------------------------|
|rx_write_requests    | +  | -  | Number of received WRITE requests for  |
|                     |    |    | the associated QP.                     |
|------------------------------------------------------------------------|
|rx_read_requests     | +  | -  | Number of received READ requests for   |
|                     |    |    | the associated QP.                     |
|------------------------------------------------------------------------|
|rx_atomic_requests   | +  | -  | Number of received ATOMIC requests for |
|                     |    |    | the associated QP.                     |
|------------------------------------------------------------------------|
|out_of_buffer        | +  | +  | Number of drops occurred due to lack   |
|                     |    |    | of WQE for the associated QPs/RQs.     |
|------------------------------------------------------------------------|
|out_of_sequence      | +  | -  | Number of errors in the packet         |
|                     |    |    | transport sequence number              |
|------------------------------------------------------------------------|
|duplicate_request    | +  | +  | Number of received duplicated packets. |
|                     |    |    | A request that previously executed is  |
|                     |    |    | named duplicated.                      |
|------------------------------------------------------------------------|
|rnr_nak_retry_err    | +  | +  | Number of received RNR NAC packets.    |
|                     |    |    | The QP retry limit did not exceed.     |
|------------------------------------------------------------------------|
|packet_seq_err       | +  | +  | Number of received NAK - sequence error|
|                     |    |    | packets. The QP retry limit did not    |
|                     |    |    | exceed.                                |
|------------------------------------------------------------------------|
|implied_nak_err      | +  | +  | Number of times the requester detected |
|                     |    |    | an ACK with a PSN larger than expected |
|                     |    |    | PSN for RDMA READ or ATOMIC response   |
|                     |    |    | The QP retry limit did not exceed.     |
|------------------------------------------------------------------------|
|local_ack_timeout_err| +  | -  | Number of NO ACK responses from        |
|                     |    |    | responder within timer interval.       |
|                     |    |    | The QP retry limit did not exceed.     |
|------------------------------------------------------------------------|

Counters are available if all of them are supported.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:40:00 -04:00
Mark Bloch
0837e86a7a IB/mlx5: Add per port counters
In order to support statistics for ports, we attach
each QP to a counter set which is dedicate to this port.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:39:14 -04:00
Artemy Kovalyov
af1ba291c5 {net, IB}/mlx5: Refactor internal SRQ API
Currently, the SRQ API uses the obsolete mlx5_*_srq_mbox_{in,out}
structs which limit the ability to pass the SRQ attributes between
net and IB parts of the driver.

This patch changes the SRQ API so as to use auto-generated structs
and provides a better way to pass attributes which will be in use by
coming features.

Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:20:07 -04:00
Bodong Wang
402ca53644 IB/mlx5: Report mlx5 TSO capabilities when querying device
Enable mlx5 based hardware to report TCP segmentation offload (TSO)
capabilities from kernel to user space. A TSO enabled NIC will accept
big chunks of data with sizes greater than MTU for TCP traffic.  The TSO
engine will break the data into separate packets and will insert headers
automatically.

The capabilities are exposed to user space through query_device by uhw
directly. The following capabilities are reported:

1. The maximum payload size in bytes supported for segmentation by TSO
   engine.
2. Bitmap showing which QP types are supported by TSO operation. The bitmap
   is built by members from 'enmu ib_qp_type'. For example, similar code
   should be performed if UD QP is supported:
	supported_qpts |= 1 << IB_QPT_UD;

To make user-space library aware of whether kernel supports uhw or not, a
new flag: cmds_supp_uhw will be returned back to user-space through
alloc_ucontext.

Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:09:18 -04:00
Maor Gottlieb
026bae0cb4 IB/mlx5: Enable flow steering for IPv6 traffic
Enable flow steering for IPv6 traffic by using an IPv6 spec.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:45 -04:00
Maor Gottlieb
89ea94a7b6 IB/mlx5: Reset flow support for IB kernel ULPs
The driver exposes interfaces that directly relate to HW state.
Upon fatal error, consumers of these interfaces (ULPs) that rely
on completion of all their posted work-request could hang, thereby
introducing dependencies in shutdown order. To prevent this from
happening, we manage the relevant resources (CQs, QPs) that are used
by the device. Upon a fatal error, we now generate simulated
completions for outstanding WQEs that were not completed at the
time the HW was reset.

It includes invoking the completion event handler for all involved
CQs so that the ULPs will poll those CQs. When polled we return
simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs
to clean up their  resources and not wait forever for completions
upon receiving remove_one.

The above change requires an extra check in the data path to make
sure that when device is in error state, the simulated CQEs will
be returned and no further WQEs will be posted.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:45 -04:00
Maor Gottlieb
7c2344c3bb IB/mlx5: Implements disassociate_ucontext API
Implements the IB core disassociate_ucontext API.
The driver detaches the HW resources for a given user context to
prevent a dependency between application termination and device
disconnect. This is done by managing the VMAs that were mapped
to the HW bars such as doorbell and blueflame. When need to detach,
remap them to an arbitrary kernel page returned by the zap API.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:44 -04:00
Yishai Hadas
28d6137008 IB/mlx5: Add RSS QP support
Add support for Raw Ethernet RX HASH QP. Currently, creation and
destruction of such a QP are supported. This QP is implemented as
a simple TIR object which points to the receive RQ indirection table.
The given hashing configuration is used to configure the TIR and by
that it chooses the right RQ from the RQ indirection table.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:44 -04:00
Yishai Hadas
c5f9092936 IB/mlx5: Add Receive Work Queue Indirection table operations
Some mlx5 based hardwares support a RQ table object. This RQ table
points to a few RQ objects. We implement the receive work queue
indirection table API (create and destroy) by using this hardware
object.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Yishai Hadas
79b20a6c30 IB/mlx5: Add receive Work Queue verbs
A QP can be created without internal WQs "packaged" inside it,
this QP can be configured to use "external" WQ object as its
receive/send queue.

WQ is a necessary component for RSS technology since RSS mechanism
is supposed to distribute the traffic between multiple
Receive Work Queues

Receive WQs are implemented by RQs.

Implement the WQ creation, modification and destruction verbs.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 11:02:43 -04:00
Talat Batheesh
00bf534fce IB/mlx5: Fix wrong naming of port_rcv_data counter
port_xmit_data is written instead of port_rcv_data.

Fixes: 3efd9a1121 ('IB/mlx5: Modify MAD reading counters method to use counter registers')
Signed-off-by: Talat Batheesh <talatb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 10:03:57 -04:00
Eli Cohen
c9b254955b IB/mlx5: Fix post send fence logic
If the caller specified IB_SEND_FENCE in the send flags of the work
request and no previous work request stated that the successive one
should be fenced, the work request would be executed without a fence.
This could result in RDMA read or atomic operations failure due to a MR
being invalidated. Fix this by adding the mlx5 enumeration for fencing
RDMA/atomic operations and fix the logic to apply this.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23 10:03:57 -04:00
Achiad Shochat
f879ee8d90 IB/mlx5: Fix alternate path code
Userspace flag IBV_QP_ALT_PATH is supposed to set the alternate path
including fields alt_pkey_index and alt_timeout.
Added IB_QP_PKEY_INDEX and IB_QP_TIMEOUT to the attribute mask when
calling mlx5_set_path for the alternate path to force setting the
alt_pkey_index and alt_timeout values.

Fixes: bf24481a3a7c4 ('IB/mlx5: Consider alternate path in pkey ...')
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:50 -04:00
Noa Osherovich
d3ae2bdeba IB/mlx5: Fix pkey_index length in the QP path record
Pkey index fields in the QP context path record are extended to 16
bits, as required by IB spec (version 1.3).
This change affects all QP commands which include path records.

To enable this change, moved the free adaptive routing flag bit
(free_ar) to the most significant byte of the QP path record.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB ...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:50 -04:00
Noa Osherovich
3c4c37746c IB/mlx5: Fix entries check in mlx5_ib_resize_cq
Verify that number of entries is less than device capability.
Add an appropriate warning message for error flow.

Fixes: bde51583f4 ('IB/mlx5: Add support for resize CQ')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:50 -04:00
Noa Osherovich
9ea5785286 IB/mlx5: Fix entries checks in mlx5_ib_create_cq
Number of entries shouldn't be greater than the device's max
capability. This should be checked before rounding the entries number
to power of two.

Fixes: 51ee86a4af ('IB/mlx5: Fix check of number of entries...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:50 -04:00
Noa Osherovich
2cc6ad5f21 IB/mlx5: Check BlueFlame HCA support
BlueFlame support is reported only for PFs when the HCA capability is
on.

Fixes: 938fe83c8d ('net/mlx5_core: New device capabilities...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:50 -04:00
Noa Osherovich
0540d8148d IB/mlx5: Fix returned values of query QP
Some variables were not initialized properly: max_recv_wr,
max_recv_sge, max_send_wr, qp_context and max_inline_data.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:49 -04:00
Noa Osherovich
bc5c6eed05 IB/mlx5: Limit query HCA clock
When PAGE_SIZE is larger than 4K, the user shouldn't be able to query
the HCA core clock. This counter is within 4KB boundary and the
user-space shall not read information that's after this boundary.

Fixes: b368d7cb8c ('IB/mlx5: Add hca_core_clock_offset to...')
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:49 -04:00
Eran Ben Elisha
c0fcebf552 IB/mlx5: Fix FW version diaplay in sysfs
Add a 4-digit padding to show FW version in proper format.

Fixes: 9603b61de1 ('mlx5: Move pci device handling from...')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:49 -04:00
Noa Osherovich
2788cf3bd9 IB/mlx5: Return PORT_ERR in Active to Initializing tranisition
FW port-change events are fired on Active <-> non Active port state
transitions only.
When the port state changes from Active to Initializing (Active ->
Down -> Initializing), a single event is fired.
The HCA transitions from Down to Initializing unless prevented from
doing so, hence the driver should also propagate events when the port
state is Initializing to consumers so they'll be aware that the port
is no longer Active and act accordingly.

Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:49 -04:00
Maor Gottlieb
da6d6ba3c6 IB/mlx5: Set flow steering capability bit
Flow steering is supported by mlx5 device when the following
features are supported by firmware:

1. NIC RX flow table.
2. Device has enough flow steering levels.
3. Atomic modification of flow table entry.
4. Flow tables chaining.

To check if flow steering is supported it's enough to check
if the driver opened the mlx5 bypass namespace.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-07 10:03:49 -04:00
Linus Torvalds
76b584d312 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma updates from Doug Ledford:
 "Primary 4.7 merge window changes

   - Updates to the new Intel X722 iWARP driver
   - Updates to the hfi1 driver
   - Fixes for the iw_cxgb4 driver
   - Misc core fixes
   - Generic RDMA READ/WRITE API addition
   - SRP updates
   - Misc ipoib updates
   - Minor mlx5 updates"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (148 commits)
  IB/mlx5: Fire the CQ completion handler from tasklet
  net/mlx5_core: Use tasklet for user-space CQ completion events
  IB/core: Do not require CAP_NET_ADMIN for packet sniffing
  IB/mlx4: Fix unaligned access in send_reply_to_slave
  IB/mlx5: Report Scatter FCS device capability when supported
  IB/mlx5: Add Scatter FCS support for Raw Packet QP
  IB/core: Add Scatter FCS create flag
  IB/core: Add Raw Scatter FCS device capability
  IB/core: Add extended device capability flags
  i40iw: pass hw_stats by reference rather than by value
  i40iw: Remove unnecessary synchronize_irq() before free_irq()
  i40iw: constify i40iw_vf_cqp_ops structure
  IB/mlx5: Add UARs write-combining and non-cached mapping
  IB/mlx5: Allow mapping the free running counter on PROT_EXEC
  IB/mlx4: Use list_for_each_entry_safe
  IB/SA: Use correct free function
  IB/core: Fix a potential array overrun in CMA and SA agent
  IB/core: Remove unnecessary check in ibnl_rcv_msg
  IB/IWPM: Fix a potential skb leak
  RDMA/nes: replace custom print_hex_dump()
  ...
2016-05-20 14:35:07 -07:00
Matan Barak
c16d2750a0 IB/mlx5: Fire the CQ completion handler from tasklet
Previously, mlx5_ib_cq_comp was executed from interrupt context.
Under heavy load, this could cause the CPU core to be in an interrupt
context too long.
Instead of executing the handler from the interrupt context we
execute it from a much friendly tasklet context.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-18 10:45:49 -04:00
Doug Ledford
0651ec932a Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into k.o/for-4.7 2016-05-13 19:40:38 -04:00
Majd Dibbiny
cff5a0f3a3 IB/mlx5: Report Scatter FCS device capability when supported
Report Scatter FCS support when the Firmware supports as well.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:28 -04:00
Majd Dibbiny
358e42ea66 IB/mlx5: Add Scatter FCS support for Raw Packet QP
Enable Scatter FCS in the RQ context when the user passes
Scatter FCS create flag.

Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:28 -04:00
Guy Levi
37aa5c36aa IB/mlx5: Add UARs write-combining and non-cached mapping
By this patch, the user space library will be able to improve
performance using appropriate ringing DoorBell method according to the
memory type it asked for.

Currently only one mapping command is allowed for UARs:
MLX5_IB_MMAP_REGULAR_PAGE. Using this mapping, the kernel maps the
UARs to write-combining (WC) if the system supports it.
If the system is not supporting WC the UARs are mapped to
non-cached(NC). In this case the user space library can't tell which
mapping is applied.
This patch adds 2 new mapping commands: MLX5_IB_MMAP_WC_PAGE and
MLX5_IB_MMAP_NC_PAGE. For these commands the kernel maps exactly as
requested and fails if it can't.

Since there is no generic way to check if the requested memory region
can be mapped as WC, driver enables conclusive WC mapping only for
x86, PowerPC and ARM which support WC for the device's memory region.

Signed-off-by: Guy Levy <guyle@mellanox.com>
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:03 -04:00
Matan Barak
6cbac1e4cd IB/mlx5: Allow mapping the free running counter on PROT_EXEC
The current mlx5 code disallows mapping the free running counter of
mlx5 based hardwares when PROT_EXEC is set.
Although this behaviour is correct, Linux does add an implicit VM_EXEC
to the vm_flags if the READ_IMPLIES_EXEC bit is set in the process
personality. This happens for example if the process stack is
executable.

This causes libmlx5 to output a warning and prevents the user from
reading the free running clock.
Executing the init segment of the hardware isn't a security risk
(at least no more than executing a process own stack), so we just
prevent writes to there.

Fixes: d69e3bcf79 ('IB/mlx5: Mmap the HCA's core clock register to
		      user-space')
Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 19:40:03 -04:00
Bart Van Assche
9aa8b3217e IB/core: Enhance ib_map_mr_sg()
The SRP initiator allows to set max_sectors to a value that exceeds
the largest amount of data that can be mapped at once with an mlx4
HCA using fast registration and a page size of 4 KB. Hence modify
ib_map_mr_sg() such that it can map partial sg-elements. If an
sg-element has been mapped partially, let the caller know
which fraction has been mapped by adjusting *sg_offset.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:57 -04:00
Christoph Hellwig
ff2ba99365 IB/core: Add passing an offset into the SG to ib_map_mr_sg
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13 13:37:11 -04:00
David S. Miller
cba6532100 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	net/ipv4/ip_gre.c

Minor conflicts between tunnel bug fixes in net and
ipv6 tunnel cleanups in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-05-04 00:52:29 -04:00
Linus Torvalds
925d96a0c9 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
 "Final set of -rc fixes for 4.6.

  I've collected up a number of patches that are all pretty small with
  the exception of only a couple.  The hfi1 driver has a number of
  important patches, and it is what really drives the line count of this
  pull request up.  These are all small and I've got this kernel built
  and running in the test lab (I have most of the hardware, I think nes
  is the only thing in this patch set that I can't say I've personally
  tested and have up and running).

  Summary:

   - A number of collected fixes for oopses, memory corruptions,
     deadlocks, etc.  All of these fixes are small (many only 5-10
     lines), obvious, and tested.

   - Fix for the security issue related to the use of write for
     bi-directional communications"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  RDMA/nes: don't leak skb if carrier down
  IB/security: Restrict use of the write() interface
  IB/hfi1: Use kernel default llseek for ui device
  IB/hfi1: Don't attempt to free resources if initialization failed
  IB/hfi1: Fix missing lock/unlock in verbs drain callback
  IB/rdmavt: Fix send scheduling
  IB/hfi1: Prevent unpinning of wrong pages
  IB/hfi1: Fix deadlock caused by locking with wrong scope
  IB/hfi1: Prevent NULL pointer deferences in caching code
  MAINTAINERS: Update iser/isert maintainer contact info
  IB/mlx5: Expose correct max_sge_rd limit
  RDMA/iw_cxgb4: Fix bar2 virt addr calculation for T4 chips
  iw_cxgb4: handle draining an idle qp
  iw_cxgb3: initialize ibdev.iwcm->ifname for port mapping
  iw_cxgb4: initialize ibdev.iwcm->ifname for port mapping
  IB/core: Don't drain non-existent rq queue-pair
  IB/core: Fix oops in ib_cache_gid_set_default_gid
2016-04-29 17:07:54 -07:00
Maor Gottlieb
d63cd28608 net/mlx5: Add user chosen levels when allocating flow tables
Currently, consumers of the flow steering infrastructure can't
choose their own flow table levels and are limited to one
flow table per level. This just waste levels.
Instead, we introduce here the possibility to use multiple
flow tables in a level. The user is free to connect these
flow tables, while following the rule (FTEs in FT of level x
could only point to FTs of level y where y > x).

In addition this patch switch the order of the create/destroy
flow tables of the NIC(vlan and main).

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-29 16:29:09 -04:00
Doug Ledford
e29bff46b9 Merge branch 'k.o/for-4.6-rc' into testing/4.6 2016-04-28 15:16:32 -04:00
Sagi Grimberg
986ef95ecd IB/mlx5: Expose correct max_sge_rd limit
mlx5 devices (Connect-IB, ConnectX-4, ConnectX-4-LX) has a limitation
where rdma read work queue entries cannot exceed 512 bytes.
A rdma_read wqe needs to fit in 512 bytes:
- wqe control segment (16 bytes)
- rdma segment (16 bytes)
- scatter elements (16 bytes each)

So max_sge_rd should be: (512 - 16 - 16) / 16 = 30.

Cc: linux-stable@vger.kernel.org
Reported-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagig@grimberg.me>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-28 10:49:17 -04:00
Saeed Mahameed
046339eaab net/mlx5e: Device's mtu field is u16 and not int
For set/query MTU port firmware commands the MTU field
is 16 bits, here I changed all the "int mtu" parameters
of the functions wrapping those firmware commands to be u16.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24 14:51:38 -04:00
Arnd Bergmann
9967c70abc IB/mlx5: fix VFs callback function prototypes
The previous patch that added a couple of callback functions put
the declarations inside of an #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING,
which causes the build to fail if that option is disabled:

drivers/infiniband/hw/mlx5/main.c: In function 'mlx5_ib_add':
drivers/infiniband/hw/mlx5/main.c:2358:31: error: 'mlx5_ib_get_vf_config' undeclared (first use in this function)

This moves the four declarations below the #ifdef section so they
are always available.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: eff901d30e ("IB/mlx5: Implement callbacks for manipulating VFs")
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-04-06 10:37:07 -04:00