Applications must not assume that max_sge and max_sge_rd are the same,
Hence expose max_sge_rd correctly as well.
Reported-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Running checkpatch.pl on mad.c produces several
"ERROR: code indent should use tabs where possible" messages.
This patch fixes these.
Signed-off-by: Jeff Becker <Jeffrey.C.Becker@nasa.gov>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch adds the value of the CNP opcode to the existing list of enumerated
opcodes in ib_pack.h
Add common OPA header definitions for driver
build:
- opa_port_info.h
- opa_smi.h
- hfi1_user.h
Additionally, ib_mad.h, has additional definitions
that are common to ib_drivers including:
- trap support
- cca support
The qib driver has the duplication removed in favor
those in ib_mad.h
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: John, Jubin <jubin.john@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The HW hasn't been sold since 2005, and the SW has definite bit rot.
Its time to remove it. So move it to staging for a few releases and
then remove it after that.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
It is now time for the ipath driver to begin to be phased out of the kernel.
This patch moves the ipath driver from the Infiniband sub tree to the staging
area where it will remain until the code is removed from the kernel in a few
releases.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Create the rdma directory in the staging area for use as we deprecate
some older drivers and as we bring in some new drivers that are in
need of work. Update the MAINTAINERS file so that updates to these
files go to linux-rdma@vger.kernel.org. Expected lifespan of this
directory is three releases for any deprecated drivers moved here
and an unknown, but theoretically bounded amount of time for the new
drivers as a new core RDMA transfer library needs to be written and
the drivers modified to use it in order for them to move out of this
directory.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Resolving a link-local IPv6 address with an unspecified source address
was broken by commit 5462eddd7a, which prevented the IPv6 stack from
learning the scope id of the link-local IPv6 address, causing random
failures as the IP stack chose a random link to resolve the address on.
This commit 5462eddd7a made us bail out of cma_check_linklocal early if
the address passed in was not an IPv6 link-local address. On the address
resolution path, the address passed in is the source address; if the
source address is the unspecified address, which is not link-local, we
will bail out early.
This is mostly correct, but if the destination address is a link-local
address, then we will be following a link-local route, and we'll need to
tell the IPv6 stack what the scope id of the destination address is.
This used to be done by last line of cma_check_linklocal, which is
skipped when bailing out early:
dev_addr->bound_dev_if = sin6->sin6_scope_id;
(In cma_bind_addr, the sin6_scope_id of the source address is set to the
sin6_scope_id of the destination address, so this is correct)
This line is required in turn for the following line, L279 of
addr6_resolve, to actually inform the IPv6 stack of the scope id:
fl6.flowi6_oif = addr->bound_dev_if;
Since we can only know we are in this failure case when we have access
to both the source IPv6 address and destination IPv6 address, we have to
deal with this further up the stack. So detect this failure case in
cma_bind_addr, and set bound_dev_if to the destination address scope id
to correct it.
Signed-off-by: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Something like this:
CPU A CPU B
Acked-by: Sean Hefty <sean.hefty@intel.com>
======================== ================================
ucma_destroy_id()
wait_for_completion()
.. anything
ucma_put_ctx()
complete()
.. continues ...
ucma_leave_multicast()
mutex_lock(mut)
atomic_inc(ctx->ref)
mutex_unlock(mut)
ucma_free_ctx()
ucma_cleanup_multicast()
mutex_lock(mut)
kfree(mc)
rdma_leave_multicast(mc->ctx->cm_id,..
Fix it by latching the ref at 0. Once it goes to 0 mc and ctx cannot
leave the mutex(mut) protection.
The other atomic_inc in ucma_get_ctx is OK because mutex(mut) protects
it from racing with ucma_destroy_id.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Currently the sg tablesize, which dictates fast register page list
depth to use, does not take into account the limits of the rdma device.
So adjust it once we discover the device fastreg max depth limit. Also
adjust the max_sectors based on the resulting sg tablesize.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The only place that assigns mr inside the loop already does a break.
So "if (mr)" will never be true here since the function initializes mr
to NULL at the top. We can just drop the extra if and break here.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The lkey table is allocated with with a get_user_pages() with an
order based on a number of index bits from a module parameter.
The underlying kernel code cannot allocate that many contiguous pages.
There is no reason the underlying memory needs to be physically
contiguous.
This patch:
- switches the allocation/deallocation to vmalloc/vfree
- caps the number of bits to 23 to insure at least 1 generation bit
o this matches the module parameter description
Cc: stable@vger.kernel.org
Reviewed-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY
but does not set the the device local_dma_lkey. This breaks
rpcrdma drivers.
Query and set this lkey when creating the device resources.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The driver is a platform driver and not a I2C driver so its modalias
should be exported with MODULE_DEVICE_TABLE(platform,...) instead of
MODULE_DEVICE_TABLE(i2c,...).
Also, remove the unnecessary MODULE_ALIAS("platform:max8997-haptic")
now that the correct module alias is created.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
The function genpd_dev_pm_detach() detaches a device from a PM domain,
however, in the description, the "dev" argument for the function is
described as the device to "attach" instead of "detach". Correct this.
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The governor dummies for the !CONFIG_PM_GENERIC_DOMAINS case are
unusable, as a governors is always referred to by taking its address,
which you can't do with a literal NULL pointer.
I.e.
pm_genpd_init(genpd, &simple_qos_governor, false);
fails to compile with:
error: lvalue required as unary '&' operand
Hence just remove the governor dummies.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
"domain": header is indented by 4, data by 0 spaces => 0 spaces
"/device": header is indented by 11, data by 4 spaces => 4 spaces
"slaves": header is indented by 47, data by 49 spaces => 48 spaces
Ruler:
1234567890123456789012345678901234567890123456789012345678901234567890
Before:
domain status slaves
/device runtime status
----------------------------------------------------------------------
a3sp on a2us
/devices/platform/e60b0000.i2c suspended
After:
domain status slaves
/device runtime status
----------------------------------------------------------------------
a3sp on a2us
/devices/platform/e60b0000.i2c suspended
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Let rapl_unregister_powercap() disable the second power limit
only if it exists.
Intel64 SDM Vol.3 14.9 says that the package domain has it
but neither the power plane domain nor the DRAM domain has it.
Signed-off-by: Seiichi Ikarashi <s.ikarashi@jp.fujitsu.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
get_cpu_topology() tries to get topology info from all cpus by reading
files in the topology sysfs dir. If a cpu is offlined, since it doesn't
have topology dir, this function fails and returns -1. This causes
functions relying on get_cpu_topology() to fail. For example-
$ cpupower monitor
Cannot read number of available processors
Fix this by skipping fetching topology info for offline cpus.
Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reported-by: Pavaman Subramaniyam <pavsubra@linux.vnet.ibm.com>
Acked-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next tree.
In sum, patches to address fallout from the previous round plus updates from
the IPVS folks via Simon Horman, they are:
1) Add a new scheduler to IPVS: The weighted overflow scheduling algorithm
directs network connections to the server with the highest weight that is
currently available and overflows to the next when active connections exceed
the node's weight. From Raducu Deaconu.
2) Fix locking ordering in IPVS, always take rtnl_lock in first place. Patch
from Julian Anastasov.
3) Allow to indicate the MTU to the IPVS in-kernel state sync daemon. From
Julian Anastasov.
4) Enhance multicast configuration for the IPVS state sync daemon. Also from
Julian.
5) Resolve sparse warnings in the nf_dup modules.
6) Fix a linking problem when CONFIG_NF_DUP_IPV6 is not set.
7) Add ICMP codes 5 and 6 to IPv6 REJECT target, they are more informative
subsets of code 1. From Andreas Herz.
8) Revert the jumpstack size calculation from mark_source_chains due to chain
depth miscalculations, from Florian Westphal.
9) Calm down more sparse warning around the Netfilter tree, again from Florian
Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
support for '%s' in bpf_trace_printk
v2->v3:
fix the comment to mention that strncpy_from_unsafe() returns
the length of the string including the trailing NUL.
v1->v2:
patch 1: generalize FETCH_FUNC_NAME(memory, string) into
strncpy_from_unsafe()
patch 2: use it in bpf_trace_printk
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
generalize FETCH_FUNC_NAME(memory, string) into
strncpy_from_unsafe() and fix sparse warnings that were
present in original implementation.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we catch future netpoll_send_udp users who use it without
disabling irqs and also as a hint for poll_controller users.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The convention in nfnetlink is to use network byte order in every header field
as well as in the attribute payload. The initial version of the batching
infrastructure assumes that res_id comes in host byte order though.
The only client of the batching infrastructure is nf_tables, so let's add a
workaround to address this inconsistency. We currently have 11 nfnetlink
subsystems according to NFNL_SUBSYS_COUNT, so we can assume that the subsystem
2560, ie. htons(10), will not be allocated anytime soon, so it can be an alias
of nf_tables from the nfnetlink batching path when interpreting the res_id
field.
Based on original patch from Florian Westphal.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
In continue to proposed Vinson Lee's post [1], this patch fixes compilation
issues founded at gcc 4.4.7. The initialization of .cidr field of unnamed
unions causes compilation error in gcc 4.4.x.
References
Visible links
[1] https://lkml.org/lkml/2015/7/5/74
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sergei Shtylyov says:
====================
Some phylib simplifications
Here's 2 patches against DaveM's 'net-next.git' repo. We simplify a bogus
string of type casts in the 1st patch and make the code respect some coding
standards of the networking code in the 2nd one. I may follow with fixing of
checkpatch.pl's complaints. if I have time..
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix scripts/checkpatch.pl's messages like:
CHECK: Comparison to NULL could be written "!phydrv->read_mmd_indirect"
BTW, it doesn't detect the reversed comparisons (which I've fixed as well).
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just some minor noise follow-up to address some stylistic issues of
commit 3b3ae88026 ("net: sched: consolidate tc_classify{,_compat}").
Accidentally v1 instead of v2 of that commit got applied, so this
patch adds the relative diff.
Suggested-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, the driver would refuse to load if it couldn't secure
enough VIs from the MC to fulfill its RSS requirements.
This was causing probe to fail on later functions in
configurations where we'd run out of VIs, such as having many
VFs.
This change allows the driver to load with fewer VIs, down to a
minimum of 2. A warning will be printed saying that RSS
requirements were not met, possibly affecting performance.
efx->max_tx_channels needs to be set to avoid going down the
failure path in efx_probe_nic() immediately in the loop after the
probe() NIC-type function.
Also, Set rc=ENOSPC when bombing out of efx_probe_nic due to lack
of VIs.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Forcing uninitialized state allows us to upgrade and reinitialize
the adapter.
FW_VERSION_T4 = 1.4.0.0
FW_VERSION_T5 = 0.0.0.0
FW_VERSION_T6 = 0.0.0.0
At this point driver supports above and greater than above version.
If FW in adapter < min FW_VERSION driver supports tries to upgrade the FW
If FW in adapter >= FW_VERSION driver supports then it follows normal path
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antonio Quartulli says:
====================
Included changes:
- code beautification
- remove obsolete 'deleted' attribute for bat-gw node
- increase internal version number
- prevent potential access to netdev object after deregistration
- set needed_head/tail_room for batman virtual interface
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix typo in conntrack.c
s/CONFIG_NF_CONNTRACK_LABEL/CONFIG_NF_CONNTRACK_LABELS/
Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern says:
====================
net: Refactor inetpeer cache and add support for VRFs
Per Dave's comment on the version 1 patch adding VRF support to inetpeer
cache by explicitly making the address + index a key. Refactored the
inetpeer code in the process; mostly impacts the use by tcp_metrics.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
inetpeer caches based on address only, so duplicate IP addresses within
a namespace return the same cached entry. Enhance the ipv4 address key
to contain both the IPv4 address and VRF device index.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the inetpeer_addr_base union to inetpeer_addr and drop
inetpeer_addr_base.
Both the a6 and in6_addr overlays are not needed; drop the __be32 version
and rename in6 to a6 for consistency with ipv4. Add a new u32 array to
the union which removes the need for the typecast in the compare function
and the use of a consistent arg for both ipv4 and ipv6 addresses which
makes the compare function more readable.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_metrics and inetpeer both have functions to compare inetpeer
addresses. Consolidate into 1 version.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use inetpeer set,get helpers in tcp_metrics rather than peeking into
the inetpeer_addr struct.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The change to use a custom dst broke tcpdump captures on the VRF device:
$ tcpdump -n -i vrf10
...
05:32:29.009362 IP 10.2.1.254 > 10.2.1.2: ICMP echo request, id 21989, seq 1, length 64
05:32:29.009855 00:00:40:01:8d:36 > 45:00:00:54:d6:6f, ethertype Unknown (0x0a02), length 84:
0x0000: 0102 0a02 01fe 0000 9181 55e5 0001 bd11 ..........U.....
0x0010: da55 0000 0000 bb5d 0700 0000 0000 1011 .U.....]........
0x0020: 1213 1415 1617 1819 1a1b 1c1d 1e1f 2021 ...............!
0x0030: 2223 2425 2627 2829 2a2b 2c2d 2e2f 3031 "#$%&'()*+,-./01
0x0040: 3233 3435 3637 234567
Local packets going through the VRF device are missing an ethernet header.
Fix by adding one and then stripping it off before pushing back to the IP
stack. With this patch you get the expected dumps:
...
05:36:15.713944 IP 10.2.1.254 > 10.2.1.2: ICMP echo request, id 23795, seq 1, length 64
05:36:15.714160 IP 10.2.1.2 > 10.2.1.254: ICMP echo reply, id 23795, seq 1, length 64
...
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
reserved for the use of routing protocols and other low-level topology
discovery or maintenance protocols, such as gateway discovery and
group membership reporting. Multicast routers should not forward any
multicast datagram with destination addresses in this range,
regardless of its TTL.
Currently, IGMP reports are generated for this reserved range of
addresses even though a router will ignore this information since it
has no purpose. However, the presence of reserved group addresses in
an IGMP membership report uses up network bandwidth and can also
obscure addresses of interest when inspecting membership reports using
packet inspection or debug messages.
Although the RFCs for the various version of IGMP (e.g.RFC 3376 for
v3) do not specify that the reserved addresses be excluded from
membership reports, it should do no harm in doing so. In particular
there should be no adverse effect in any IGMP snooping functionality
since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
Snooping Switches Considerations) section 2.1.2. Data Forwarding
Rules:
2) Packets with a destination IP (DIP) address in the 224.0.0.X
range which are not IGMP must be forwarded on all ports.
IGMP reports for local multicast groups can now be optionally
inhibited by means of a system control variable (by setting the value
to zero) e.g.:
echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
To retain backwards compatibility the previous behaviour is retained
by default on system boot or reverted by setting the value back to
non-zero e.g.:
echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports
Signed-off-by: Philip Downey <pdowney@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>