The xfs_attr3_root_inactive() call from xfs_attr_inactive() assumes that
attribute blocks exist to invalidate. It is possible to have an
attribute fork without extents, however. Consider the case where the
attribute fork is created towards the beginning of xfs_attr_set() but
some part of the subsequent attribute set fails.
If an inode in such a state hits xfs_attr_inactive(), it eventually
calls xfs_dabuf_map() and possibly xfs_bmapi_read(). The former emits a
filesystem corruption warning, returns an error that bubbles back up to
xfs_attr_inactive(), and leads to destruction of the in-core attribute
fork without an on-disk reset. If the inode happens to make it back
through xfs_inactive() in this state (e.g., via a concurrent bulkstat
that cycles the inode from the reclaim state and releases it), i_afp
might not exist when xfs_bmapi_read() is called and causes a NULL
dereference panic.
A '-p 2' fsstress run to ENOSPC on a relatively small fs (1GB)
reproduces these problems. The behavior is a regression caused by:
6dfe5a0 xfs: xfs_attr_inactive leaves inconsistent attr fork state behind
... which removed logic that avoided the attribute extent truncate when
no extents exist. Restore this logic to ensure the attribute fork is
destroyed and reset correctly if it exists without any allocated
extents.
cc: stable@vger.kernel.org # 3.12 to 4.0.x
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Pull locking updates from Ingo Molnar:
"The main changes are:
- 'qspinlock' support, enabled on x86: queued spinlocks - these are
now the spinlock variant used by x86 as they outperform ticket
spinlocks in every category. (Waiman Long)
- 'pvqspinlock' support on x86: paravirtualized variant of queued
spinlocks. (Waiman Long, Peter Zijlstra)
- 'qrwlock' support, enabled on x86: queued rwlocks. Similar to
queued spinlocks, they are now the variant used by x86:
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
CONFIG_QUEUED_SPINLOCKS=y
CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
CONFIG_QUEUED_RWLOCKS=y
- various lockdep fixlets
- various locking primitives cleanups, further WRITE_ONCE()
propagation"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
locking/lockdep: Remove hard coded array size dependency
locking/qrwlock: Don't contend with readers when setting _QW_WAITING
lockdep: Do not break user-visible string
locking/arch: Rename set_mb() to smp_store_mb()
locking/arch: Add WRITE_ONCE() to set_mb()
rtmutex: Warn if trylock is called from hard/softirq context
arch: Remove __ARCH_HAVE_CMPXCHG
locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
locking/pvqspinlock, x86: Enable PV qspinlock for Xen
locking/pvqspinlock, x86: Enable PV qspinlock for KVM
locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
locking/pvqspinlock: Implement simple paravirt support for the qspinlock
locking/qspinlock: Revert to test-and-set on hypervisors
locking/qspinlock: Use a simple write to grab the lock
locking/qspinlock: Optimize for smaller NR_CPUS
locking/qspinlock: Extract out code snippets for the next patch
locking/qspinlock: Add pending bit
...
Pull vfs updates from Al Viro:
"In this pile: pathname resolution rewrite.
- recursion in link_path_walk() is gone.
- nesting limits on symlinks are gone (the only limit remaining is
that the total amount of symlinks is no more than 40, no matter how
nested).
- "fast" (inline) symlinks are handled without leaving rcuwalk mode.
- stack footprint (independent of the nesting) is below kilobyte now,
about on par with what it used to be with one level of nested
symlinks and ~2.8 times lower than it used to be in the worst case.
- struct nameidata is entirely private to fs/namei.c now (not even
opaque pointers are being passed around).
- ->follow_link() and ->put_link() calling conventions had been
changed; all in-tree filesystems converted, out-of-tree should be
able to follow reasonably easily.
For out-of-tree conversions, see Documentation/filesystems/porting
for details (and in-tree filesystems for examples of conversion).
That has sat in -next since mid-May, seems to survive all testing
without regressions and merges clean with v4.1"
* 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (131 commits)
turn user_{path_at,path,lpath,path_dir}() into static inlines
namei: move saved_nd pointer into struct nameidata
inline user_path_create()
inline user_path_parent()
namei: trim do_last() arguments
namei: stash dfd and name into nameidata
namei: fold path_cleanup() into terminate_walk()
namei: saner calling conventions for filename_parentat()
namei: saner calling conventions for filename_create()
namei: shift nameidata down into filename_parentat()
namei: make filename_lookup() reject ERR_PTR() passed as name
namei: shift nameidata inside filename_lookup()
namei: move putname() call into filename_lookup()
namei: pass the struct path to store the result down into path_lookupat()
namei: uninline set_root{,_rcu}()
namei: be careful with mountpoint crossings in follow_dotdot_rcu()
Documentation: remove outdated information from automount-support.txt
get rid of assorted nameidata-related debris
lustre: kill unused helper
lustre: kill unused macro (LOOKUP_CONTINUE)
...
Remove the hack where we fput the read-specific file in generic code.
Instead we can do it in nfsd4_encode_read as that gets called for all
error cases as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch changes nfs4_preprocess_stateid_op so it always returns
a valid struct file if it has been asked for that. For that we
now allocate a temporary struct file for special stateids, and check
permissions if we got the file structure from the stateid. This
ensures that all callers will get their handling of special stateids
right, and avoids code duplication.
There is a little wart in here because the read code needs to know
if we allocated a file structure so that it can copy around the
read-ahead parameters. In the long run we should probably aim to
cache full file structures used with special stateids instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
* bugfixes:
NFS: Ensure we set NFS_CONTEXT_RESEND_WRITES when requeuing writes
pNFS: Fix a memory leak when attempted pnfs fails
NFS: Ensure that we update the sequence id under the slot table lock
nfs: Initialize cb_sequenceres information before validate_seqid()
nfs: Only update callback sequnce id when CB_SEQUENCE success
NFSv4: nfs4_handle_delegation_recall_error should ignore EAGAIN
If jffs2 can deadlock on overlayfs readdir because it takes the same lock
on ->iterate() as in ->lookup().
Fix by moving whiteout checking outside iterate_dir(). Optimized by
collecting potential whiteouts (DT_CHR) in a temporary list and if
non-empty iterating throug these and checking for a 0/0 chardev.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: 49c21e1cac ("ovl: check whiteout while reading directory")
Reported-by: Roman Yeryomin <leroi.lists@gmail.com>
Allow filesystems with .d_revalidate as lower layer(s), but not as upper
layer.
For local filesystems the rule was that modifications on the layers
directly while being part of the overlay results in undefined behavior.
This can easily be extended to distributed filesystems: we assume the tree
used as lower layer is static, which means ->d_revalidate() should always
return "1". If that is not the case, return -ESTALE, don't try to work
around the modification.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
NFS and other distributed filesystems may place automount points in the
tree. Previoulsy overlayfs refused to mount such filesystems types (based
on the existence of the .d_automount callback), even if the actual export
didn't have any automount points.
It cannot be determined in advance whether the filesystem has automount
points or not. The solution is to allow fs with .d_automount but refuse to
traverse any automount points encountered.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
At LSF we decided that if we truncate up from isize we shouldn't trim
fallocated blocks that were fallocated with KEEP_SIZE and are past the
new i_size. This patch fixes ext4 to do this.
[ Completely reworked patch so that i_disksize would actually get set
when truncating up. Also reworked the code for handling truncate so
that it's easier to handle. -- tytso ]
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Make the error reporting behavior resulting from the unsupported use
of online defrag on files with data journaling enabled consistent with
that implemented for bigalloc file systems. Difference found with
ext4/308.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Remove outdated comments and dead code from ext4_da_reserve_space.
Clean up its trace point, and relocate it to make it more useful.
While we're at it, fix a nearby conditional used to determine if
we have a non-bigalloc file system. It doesn't match usage elsewhere
in the code, and misleadingly suggests that an s_cluster_ratio value
of 0 would be legal.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
ext4 isn't willing to map clusters to a non-extent file. Don't signal
this with an out of space error, since the FS will retry the
allocation (which didn't fail) forever. Instead, return EUCLEAN so
that the operation will fail immediately all the way back to userspace.
(The fix is either to run e2fsck -E bmap2extent, or to chattr +e the file.)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
We no longer calculate the minimum freelist size from the on-disk
AGF, so we don't need the macros used for this. That means the
nested macros can be cleaned up, and turn this into an actual
function so the logic is clear and concise. This will make it much
easier to add support for the rmap btree when the time comes.
This also gets rid of the XFS_AG_MAXLEVELS macro used by these
freelist macros as it is simply a wrapper around a single variable.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The error handling is currently an inconsistent mess as every error
condition handles return values and releasing buffers individually.
Clean this up by using gotos and a sane error label stack.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
The longest extent length checks in xfs_alloc_fix_freelist() are now
essentially identical. Factor them out into a helper function, so we
know they are checking exactly the same thing before and after we
lock the AGF.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
At the moment, xfs_alloc_fix_freelist() uses a mix of per-ag based
access and agf buffer based access to freelist and space usage
information. However, once the AGF buffer is locked inside this
function, it is guaranteed that both the in-memory and on-disk
values are identical. xfs_alloc_fix_freelist() doesn't modify the
values in the structures directly, so it is a read-only user of the
infomration, and hence can use the per-ag structure exclusively for
determining what it should do.
This opens up an avenue for cleaning up a lot of duplicated logic
whose only difference is the structure it gets the data from, and in
doing so removes a lot of needless byte swapping overhead when
fixing up the free list.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Just use char pointers directly instead of the confusing typedef to a
pointer type.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Compared to char pointers this saves us a lot of casting effort. Also
add another local variable to make the code easier to read.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
This avoids all kinds of unessecary casts in an envrionment like Linux where
we can assume that pointer arithmetics are support on void pointers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
We can simply use a void pointer to pass a long return addresses in the
debugging helpers.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Replace uses of __psint_t with the proper uintptr_t and ptrdiff_t types,
and remove the defintions of __psint_t and __psunsigned_t.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
If we create a CRC filesystem, mount it, and create a symlink with
a path long enough that it can't live in the inode, we get a very
strange result upon remount:
# ls -l mnt
total 4
lrwxrwxrwx. 1 root root 929 Jun 15 16:58 link -> XSLM
XSLM is the V5 symlink block header magic (which happens to be
followed by a NUL, so the string looks terminated).
xfs_readlink_bmap() advanced cur_chunk by the size of the header
for CRC filesystems, but never actually used that pointer; it
kept reading from bp->b_addr, which is the start of the block,
rather than the start of the symlink data after the header.
Looks like this problem goes back to v3.10.
Fixing this gets us reading the proper link target, again.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
In order to prevent quota block tracking to be inaccurate when
ext4_quota_write() fails with ENOSPC, we make two changes. The quota
file can now use the reserved block (since the quota file is arguably
file system metadata), and ext4_quota_write() now uses
ext4_should_retry_alloc() to retry the block allocation after a commit
has completed and released some blocks for allocation.
This fixes failures of xfstests generic/270:
Quota error (device vdc): write_blk: dquota write failed
Quota error (device vdc): qtree_write_dquot: Error -28 occurred while creating quota
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Normally all of the buffers will have been forced out to disk before
we call invalidate_bdev(), but there will be some cases, where a file
system operation was aborted due to an ext4_error(), where there may
still be some dirty buffers in the buffer cache for the device. So
try to force them out to memory before calling invalidate_bdev().
This fixes a warning triggered by generic/081:
WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f()
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
It is often the case that we mark buffer as having dirty metadata when
the buffer is already in that state (frequent for bitmaps, inode table
blocks, superblock). Thus it is unnecessary to contend on grabbing
journal head reference and bh_state lock. Avoid that by checking whether
any modification to the buffer is needed before grabbing any locks or
references.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Split out two self contained helpers to make the function more readable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Refactor the raparam hash helpers to just deal with the raparms,
and keep opening/closing files separate from that.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Use kernel.h macro definition.
Thanks to Julia Lawall for Coccinelle scripting support.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
This patch allows the block allocation code to retain the buffers
for the resource groups so they don't need to be re-read from buffer
cache with every request. This is a performance improvement that's
especially noticeable when resource groups are very large. For
example, with 2GB resource groups and 4K blocks, there can be 33
blocks for every resource group. This patch allows those 33 buffers
to be kept around and not read in and thrown away with every
operation. The buffers are released when the resource group is
either synced or invalidated.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Benjamin Marzinski <bmarzins@redhat.com>
This patch will add support to show the replacing target in sysfs
during the process of replacement.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.cz>
When btrfs on a device is overwritten with a new btrfs (mkfs),
the old btrfs instance in the kernel becomes stale. So with this
patch, if kernel finds device is overwritten then delete the stale
fsid/uuid.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Make file->f_path always point to the overlay dentry so that the path in
/proc/pid/fd is correct and to ensure that label-based LSMs have access to the
overlay as well as the underlay (path-based LSMs probably don't need it).
Using my union testsuite to set things up, before the patch I see:
[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
...
lr-x------. 1 root root 64 Jun 5 14:38 5 -> /a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 13381 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 13381 Links: 1
...
After the patch:
[root@andromeda union-testsuite]# bash 5</mnt/a/foo107
[root@andromeda union-testsuite]# ls -l /proc/$$/fd/
...
lr-x------. 1 root root 64 Jun 5 14:22 5 -> /mnt/a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 40346 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 40346 Links: 1
...
Note the change in where /proc/$$/fd/5 points to in the ls command. It was
pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
(which is correct).
The inode accessed, however, is the lower layer. The union layer is on device
25h/37d and the upper layer on 24h/36d.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Call ovl_drop_write() earlier in ovl_dentry_open() before we call vfs_open()
as we've done the copy up for which we needed the freeze-write lock by that
point.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move kernfs_get_inode() prototype from fs/kernfs/kernfs-internal.h to
include/linux/kernfs.h. It obtains the matching inode for a
kernfs_node.
It will be used by cgroup for inode based permission checks for now
but is generally useful.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The glocks used for resource groups often come and go hundreds of
thousands of times per second. Adding them to the lru list just
adds unnecessary contention for the lru_lock spin_lock, especially
considering we're almost certainly going to re-use the glock and
take it back off the lru microseconds later. We never want the
glock shrinker to cull them anyway. This patch adds a new bit in
the glops that determines which glock types get put onto the lru
list and which ones don't.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
If a write attempt fails, and the write is queued up for resending to
the server, as opposed to being dropped, then we need to set the
appropriate flag so that nfs_file_fsync() does the right thing.
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
pnfs_do_write() expects the call to pnfs_write_through_mds() to free the
pgio header and to release the layout segment before exiting. The problem
is that nfs_pgio_data_destroy() doesn't actually do this; it only frees
the memory allocated by nfs_generic_pgio().
Ditto for pnfs_do_read()...
Fix in both cases is to add a call to hdr->release(hdr).
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
FS_CGROUP_WRITEBACK indicates whether a file_system_type supports
cgroup writeback; however, different super_blocks of the same
file_system_type may or may not support cgroup writeback depending on
filesystem options. This patch replaces FS_CGROUP_WRITEBACK with a
per-super_block flag.
super_block->s_flags carries some internal flags in the high bits but
it's exposd to userland through uapi header and running out of space
anyway. This patch adds a new field super_block->s_iflags to carry
kernel-internal flags. It is currently only used by the new
SB_I_CGROUPWB flag whose concatenated and abbreviated name is for
consistency with other super_block flags.
ext2_fill_super() is updated to set SB_I_CGROUPWB.
v2: Added super_block->s_iflags instead of stealing another high bit
from sb->s_flags as suggested by Christoph and Jan.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-ext4@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Currently, even when a filesystem doesn't set the FS_CGROUP_WRITEBACK
flag, if the filesystem uses wbc_init_bio() and wbc_account_io(), the
foreign inode detection and migration logic still ends up activating
cgroup writeback which is unexpected. This patch ensures that the
foreign inode detection logic stays disabled when inode_cgwb_enabled()
is false by not associating writeback_control's with bdi_writeback's.
This also avoids unnecessary operations in wbc_init_bio(),
wbc_account_io() and wbc_detach_inode() for filesystems which don't
support cgroup writeback.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add last missing line in commit "cdd9eefdf905"
("fs/ufs: restore s_lock mutex")
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The INOTIFY_USER option is bool, and hence this code is either
present or absent. It will never be modular, so using
module_init as an alias for __initcall is rather misleading.
Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.
Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups. As __initcall gets
mapped onto device_initcall, our use of fs_initcall (which
makes sense for fs code) will thus change this registration
from level 6-device to level 5-fs (i.e. slightly earlier).
However no observable impact of that small difference has
been observed during testing, or is expected.
Cc: John McCutchan <john@johnmccutchan.com>
Cc: Robert Love <rlove@rlove.org>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
NFS: NFSoRDMA Client Changes
These patches continue to build up for improving the rsize and wsize that the
NFS client uses when talking over RDMA. In addition, these patches also add
in scalability enhancements and other bugfixes.
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
* tag 'nfs-rdma-for-4.2' of git://git.linux-nfs.org/projects/anna/nfs-rdma: (142 commits)
xprtrdma: Reduce per-transport MR allocation
xprtrdma: Stack relief in fmr_op_map()
xprtrdma: Split rb_lock
xprtrdma: Remove rpcrdma_ia::ri_memreg_strategy
xprtrdma: Remove ->ro_reset
xprtrdma: Remove unused LOCAL_INV recovery logic
xprtrdma: Acquire MRs in rpcrdma_register_external()
xprtrdma: Introduce an FRMR recovery workqueue
xprtrdma: Acquire FMRs in rpcrdma_fmr_register_external()
xprtrdma: Introduce helpers for allocating MWs
xprtrdma: Use ib_device pointer safely
xprtrdma: Remove rr_func
xprtrdma: Replace rpcrdma_rep::rr_buffer with rr_rxprt
xprtrdma: Warn when there are orphaned IB objects
...