Add services for the allocating, freeing, and unmapping Fast Reg MR. These
services will be used by the transport connection setup, send and receive
routines.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Jay Cliburn noticed and diagnosed a bug triggered in
dev_gso_skb_destructor() after last change from qdisc->gso_skb
to qdisc->requeue list. Since gso_segmented skbs can't be queued
to another list this patch brings back qdisc->gso_skb for them.
Reported-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provides implementation of the enhancements of XFRM/PF_KEY MIGRATE mechanism
specified in draft-ebalard-mext-pfkey-enhanced-migrate-00. Defines associated
PF_KEY SADB_X_EXT_KMADDRESS extension and XFRM/netlink XFRMA_KMADDRESS
attribute.
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This protocol provides some connection handling and negotiated
congestion control. Nokia cellular modems use it for bulk transfers.
It provides packet boundaries (hence SOCK_SEQPACKET). Congestion
control is per packet rather per byte, so we do not re-use the
generic socket memory accounting.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clean up: Now that lockd_up() starts listeners for both transports, the
"proto" argument is no longer needed.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Since v.2.6.23 DEF_INITSEG and DEF_SETUPSEG are unused. Commit c397368
("Remove old i386 setup code") dropped their usage for i386. They did not
return in the x86 tree. (Something similar must have happened for x86_64.)
Remove these.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Any block based fs (this patch includes ext3) just has to declare its own
fiemap() function and then call this generic function with its own
get_block_t. This works well for block based filesystems that will map
multiple contiguous blocks at one time, but will work for filesystems that
only map one block at a time, you will just end up with an "extent" for each
block. One gotcha is this will not play nicely where there is hole+data
after the EOF. This function will assume its hit the end of the data as soon
as it hits a hole after the EOF, so if there is any data past that it will
not pick that up. AFAIK no block based fs does this anyway, but its in the
comments of the function anyway just in case.
Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
Basic vfs-level fiemap infrastructure, which sets up a new ->fiemap
inode operation.
Userspace can get extent information on a file via fiemap ioctl. As input,
the fiemap ioctl takes a struct fiemap which includes an array of struct
fiemap_extent (fm_extents). Size of the extent array is passed as
fm_extent_count and number of extents returned will be written into
fm_mapped_extents. Offset and length fields on the fiemap structure
(fm_start, fm_length) describe a logical range which will be searched for
extents. All extents returned will at least partially contain this range.
The actual extent offsets and ranges returned will be unmodified from their
offset and range on-disk.
The fiemap ioctl returns '0' on success. On error, -1 is returned and errno
is set. If errno is equal to EBADR, then fm_flags will contain those flags
which were passed in which the kernel did not understand. On all other
errors, the contents of fm_extents is undefined.
As fiemap evolved, there have been many authors of the vfs patch. As far as
I can tell, the list includes:
Kalpak Shah <kalpak.shah@sun.com>
Andreas Dilger <adilger@sun.com>
Eric Sandeen <sandeen@redhat.com>
Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: linux-api@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
The nlm_reboot structure is used to store information provided by the
NSM_NOTIFY procedure. This procedure is not specified by the NLM or NSM
protocols, other than to say that the procedure can be used to transmit
information private to a particular NLM/NSM implementation.
For Linux, the callback arguments include the name of the monitored host,
the new NSM state of the host, and a 16-byte private opaque.
As a clean up, remove the unused fields and the server-side XDR logic that
decodes them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
lockd accepts SM_NOTIFY calls only from a privileged process on the
local system. If lockd uses an AF_INET6 listener, the sender's address
(ie the local rpc.statd) will be the IPv6 loopback address, not the
IPv4 loopback address.
Make sure the privilege test in nlmsvc_proc_sm_notify() and
nlm4svc_proc_sm_notify() works for both AF_INET and AF_INET6 family
addresses by refactoring the test into a helper and adding support for
IPv6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Adjust the signature and callers of nlmclnt_grant() to pass a "struct
sockaddr *" instead of a "struct sockaddr_in *" in order to support IPv6
addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Fix up nlmsvc_lookup_host() to pass AF_INET6 source addresses to
nlm_lookup_host().
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Pass a struct sockaddr * and a length to nlmclnt_lookup_host() to
accomodate non-AF_INET family addresses.
As a side benefit, eliminate the hostname_len argument, as the hostname
is always NUL-terminated.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Add data types to track Fast Reg Memory Regions. The core data type is
svc_rdma_fastreg_mr that associates a device MR with a host kva and page
list. A field is added to the WR context to keep track of the FRMR
used to map the local memory for an RPC.
An FRMR list and spin lock are added to the transport instance to keep
track of all FRMR allocated for the transport. Also added are device
capability flags to indicate what the memory registration
capabilities are for the underlying device and whether or not fast
memory registration is supported.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Do all the grace period checks in svclock.c. This simplifies the code a
bit, and will ease some later changes.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Rewrite grace period code to unify management of grace period across
lockd and nfsd. The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it. This creates a slight race condition, since
the enforcement is not coordinated. It's also more complicated than
necessary.
Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.
We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Rework of SMTC support to make it work with the new clock event system,
allowing "tickless" operation, and to make it compatible with the use of
the "wait_irqoff" idle loop. The new clocking scheme means that the
previously optional IPI instant replay mechanism is now required, and has
been made more robust.
Signed-off-by: Kevin D. Kissell <kevink@paralogos.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Though from a hardware perspective it would be sensible to use only a
32-bit unsigned int type Linux defines interrupt flags to be stored in
an unsigned long and nothing else.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
There's already a fc_vport_termintate() call exported by
the transport. This patch adds a symmetric call to the API to allow
an NPIV-capable LLD to instantiate vports sans user intervention.
Additional comments/updates:
Re: scsi_fc_transport.txt
Add a function prototype for fc_vport_terminate similar to what's
done for fc_vport_create
Re: fc_vport_create
I recommend we pass the channel number in fc_vport_create rather
than fixing it at zero.
Also, ids->vport_type should be set to FC_PORTTYPE_NPIV prior to
calling fc_vport_create. The comment is also meaningless.
Added-by and
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds the ability to print sizes in either units of 10^3 (SI)
or 2^10 (Binary) units. It rounds up to three significant figures and
can be used for either memory or storage capacities.
Oh, and I'm fully aware that 64 bits is only 16EiB ... the Zetta and
Yotta units are added for future proofing against the day we have 128
bit computers ...
[fujita.tomonori@lab.ntt.co.jp: fix missed unsigned long long cast]
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Brian King <brking@linux.vnet.ibm.com> reported that fibre channel
devices can oops during scanning if their ports block (because the
device goes from CREATED -> BLOCK -> RUNNING rather than CREATED ->
BLOCK -> CREATED).
Fix this by adding a new state: CREATED_BLOCK which can only transition
back to CREATED and disallow the CREATED -> BLOCK transition. Now both
the created and blocked states that the mid-layer recognises can include
CREATED_BLOCK.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The created and blocked states are very shortly going to correspond to
mixed sdev_state states.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds scsi netlink recieve and event support for transport
and scsi LLDD's. It is a reimplementation of the patch posted last
week by David Somayajulu.
http://marc.info/?l=linux-scsi&m=121745486221819&w=2
There are a few things done differently:
- Transport support is included
- Event delivery is included
- The vendor message is now its own unique message type, considered
part of the generic "SCSI Transport".
- LLDD entry points are now registered rather than included in the
scsi_host_template.
Background: When I started to implement the event handler via template,
I had to either: muck up scsi_add_host and scsi_remove_host; or have
the event handler search all possible shosts. Neither was acceptable.
Moving to a registration solves this, and also limits the scope of
the changes to something that could be backported to a distro without
breaking an already-released-distro kabi. However, I admit it isn't
as elegant, as the passing of the LLDD host template in the
registration and the complexity around dynamic add/remove shows.
- The receive path was augmented to require a unique identifier for
the LLDD before the message was allowed to be handed off to the
driver. Given how quickly very fatal errors occur if there's msg
mismatches (which I saw in testing my own tools :), I believe this
to be a very good thing. The id plays off the vendor id scheme already
introduced for the vendor unique event messages used by FC.
Additionally, the id use as the basis of the registration/deregistration.
- Send assist functions, for both the transport and LLDDs are included.
[fujita.tomonori@lab.ntt.co.jp: fix missing cast]
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled.
This is a debugging feature to detect infinite loops in kernel code, not
something that non-kernel-hackers would be expected to care about.
This feature can detect looping CPUs in !PREEMPT builds and looping CPUs
with preemption disabled in PREEMPT builds. This is essentially a port of
this functionality from the treercu patch, replacing the stall debug patch
that is already in tip/core/rcu (commit 67182ae1c4).
The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The balloon driver doesn't have any externally callable functions at
the moment, so remove the (effectively empty) header. We can add it
back if we need to.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The previous patch db203d53d4 ("mm:
tiny-shmem fix lock ordering: mmap_sem vs i_mutex") to fix the lock
ordering in tiny-shmem breaks shared anonymous and IPC memory on NOMMU
architectures because it was using the expanding truncate to signal ramfs
to allocate a physically contiguous RAM backing the inode (otherwise it is
unusable for "memory mapping" it to userspace).
However do_truncate is what caused the lock ordering error, due to it
taking i_mutex. In this case, we can actually just call ramfs directly to
allocate memory for the mapping, rather than go via truncate.
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sctp_is_any() function that is used to check for wildcard addresses
only looks at the address itself to determine the address family.
This function is used in the API to check the address passed in from
the user. If the user simply zerroes out the sockaddr_storage and
pass that in, we'll end up failing. So, let's try harder to determine
the address family by also checking the socket if it's possible.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
sctp_chunks should be put on a diet. This is some of the low hanging
fruit that we can strip out. Changes all the __s8/__u8 flags to
bitfields. Saves 12 bytes per chunk.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
The iptables tproxy code has to be able to do UDP socket hash lookups,
so we have to provide an exported lookup function for this purpose.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current TCP code relies on the local port of the listening socket
being the same as the destination address of the incoming
connection. Port redirection used by many transparent proxying
techniques obviously breaks this, so we have to store the original
destination port address.
This patch extends struct inet_request_sock and stores the incoming
destination port value there. It also modifies the handshake code to
use that value as the source port when sending reply packets.
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>