Commit Graph

49262 Commits

Author SHA1 Message Date
Eric Biggers
8048123576 fscrypto: remove unneeded Kconfig dependencies
SHA256 and ENCRYPTED_KEYS are not needed.  CTR shouldn't be needed
either, but I left it for now because it was intentionally added by
commit 71dea01ea2 ("ext4 crypto: require CONFIG_CRYPTO_CTR if ext4
encryption is enabled").  So it sounds like there may be a dependency
problem elsewhere, which I have not been able to identify specifically,
that must be solved before CTR can be removed.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:07 -05:00
Sergey Karamov
73b92a2a5e ext4: do not perform data journaling when data is encrypted
Currently data journalling is incompatible with encryption: enabling both
at the same time has never been supported by design, and would result in
unpredictable behavior. However, users are not precluded from turning on
both features simultaneously. This change programmatically replaces data
journaling for encrypted regular files with ordered data journaling mode.

Background:
Journaling encrypted data has not been supported because it operates on
buffer heads of the page in the page cache. Namely, when the commit
happens, which could be up to five seconds after caching, the commit
thread uses the buffer heads attached to the page to copy the contents of
the page to the journal. With encryption, it would have been required to
keep the bounce buffer with ciphertext for up to the aforementioned five
seconds, since the page cache can only hold plaintext and could not be
used for journaling. Alternatively, it would be required to setup the
journal to initiate a callback at the commit time to perform deferred
encryption - in this case, not only would the data have to be written
twice, but it would also have to be encrypted twice. This level of
complexity was not justified for a mode that in practice is very rarely
used because of the overhead from the data journalling.

Solution:
If data=journaled has been set as a mount option for a filesystem, or if
journaling is enabled on a regular file, do not perform journaling if the
file is also encrypted, instead fall back to the data=ordered mode for the
file.

Rationale:
The intent is to allow seamless and proper filesystem operation when
journaling and encryption have both been enabled, and have these two
conflicting features gracefully resolved by the filesystem.

Fixes: 4461471107
Signed-off-by: Sergey Karamov <skaramov@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2016-12-10 17:54:58 -05:00
David S. Miller
821781a9f4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-10 16:21:55 -05:00
Darrick J. Wong
29ac8e856c ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
Connect the new VFS clone_range, copy_range, and dedupe_range features
to the existing reflink capability of ocfs2.  Compared to the existing
ocfs2 reflink ioctl We have to do things a little differently to support
the VFS semantics (we can clone subranges of a file but we don't clone
xattrs), but the VFS ioctls are more broadly supported.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: Convert inline data files to extents files before reflinking,
and fix i_blocks so that stat(2) output is correct.
v3: Make zero-length dedupe consistent with btrfs behavior.
v4: Use VFS double-inode lock routines and remove MAX_DEDUPE_LEN.
2016-12-10 12:39:45 -08:00
Darrick J. Wong
86e59436d4 ocfs2: charge quota for reflinked blocks
When ocfs2 shares blocks from one file to another, it's necessary to
charge that many blocks to the quota because ocfs2 tallies block charges
according to the number of blocks mapped, not the number of physical
blocks used.

Without this patch, reflinking X blocks and then CoWing all of them
causes quota usage to *decrease* by X as seen in generic/305.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
aef73a61c0 ocfs2: fix bad pointer cast
generic/188 triggered a dmesg stack trace because the dio completion
was casting a buffer head to an on-disk inode, which is whacky.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
dbf896fc28 ocfs2: always unlock when completing dio writes
Always unlock the inode when completing dio writes, even if an error
has occurrred.  The caller already checks the inode and unlocks it
if needed, so we might as well reduce contention.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
085549553d ocfs2: don't eat io errors during _dio_end_io_write
ocfs2_dio_end_io_write eats whatever errors may happen,
which means that write errors do not propagate to userspace.
Fix that.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
3e10b793fc ocfs2: budget for extent tree splits when adding refcount flag
When we're adding the refcount flag to an extent, we have to budget
enough space to handle a full extent btree split in addition to
whatever modifications have to be made to the refcount btree.  We
don't currently do this, with the result that generic/186 crashes
when we need an extent split but not a refcount split because meta_ac
never gets allocated.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
06a7030581 ocfs2: prohibit refcounted swapfiles
The swapfile mechanism calls bmap once to find all the swap file
mappings, which means that we cannot properly support CoW remapping.
Therefore, error out if the swap code tries to call bmap on a
refcounted file.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
86544fbd85 ocfs2: add newlines to some error messages
These two error messages are missing the trailing newline.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Darrick J. Wong
84e40080bd ocfs2: convert inode refcount test to a helper
Replace the open-coded inode refcount flag test with a helper function
to reduce the potential for bugs.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-10 12:39:45 -08:00
Al Viro
04fff6416c simple_write_end(): don't zero in short copy into uptodate
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-10 14:25:19 -05:00
Al Viro
92e50d2d42 exofs: don't mess with simple_write_{begin,end}
... and don't zero anything on short copy; just unlock
and return 0 if that has happened on non-uptodate page.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-10 14:25:19 -05:00
Al Viro
77469c3f57 9p: saner ->write_end() on failing copy into non-uptodate page
If we had a short copy into an uptodate page, there's no reason
whatsoever to zero anything; OTOH, if that page had _not_ been
uptodate, we must have been trying to overwrite it completely
and got a short copy.  In that case, overwriting the end with
zeroes, marking uptodate and sending to server is just plain
wrong.  Just unlock, keep it non-uptodate and return 0.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-10 14:25:18 -05:00
Al Viro
43388b21e7 fix gfs2_stuffed_write_end() on short copies
a) the page is uptodate - ->write_begin() would either fail (in which
case we don't reach ->write_end()), or unstuff the inode, or find the
page already uptodate, or do a successful call of stuffed_readpage(),
which would've made it uptodate

b) zeroing the tail in pagecache is wrong.  kill -9 at the right time
while writing unmodified file contents to the same file should _not_
leave us in a situation when read() from the file will be reporting
it full of zeroes.  Especially since that effect will be transient -
at some later point the page will be evicted and then we'll be back
to the real file contents.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-10 14:25:18 -05:00
Al Viro
b9de313cf0 fix ceph_write_end()
don't zero on short copies; if the page was uptodate it's just plain
wrong, and if it wasn't we'll be better off just returning 0 and
buggering off.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-10 14:24:45 -05:00
Dan Carpenter
578620f451 ext4: return -ENOMEM instead of success
We should set the error code if kzalloc() fails.

Fixes: 67cf5b09a4 ("ext4: add the basic function for inline data support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2016-12-10 09:56:01 -05:00
Darrick J. Wong
7e6e1ef48f ext4: reject inodes with negative size
Don't load an inode with a negative size; this causes integer overflow
problems in the VFS.

[ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT]

Fixes: a48380f769 (ext4: rename i_dir_acl to i_size_high)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2016-12-10 09:55:01 -05:00
Andreas Gruenbacher
dff25ddb48 nfs: add support for the umask attribute
Clients can set the umask attribute when creating files to cause the
server to apply it always except when inheriting permissions from the
parent directory.  That way, the new files will end up with the same
permissions as files created locally.

See https://tools.ietf.org/html/draft-ietf-nfsv4-umask-02 for more details.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 23:47:10 -05:00
Al Viro
c0cf3ef5e0 nfs_write_end(): fix handling of short copies
What matters when deciding if we should make a page uptodate is
not how much we _wanted_ to copy, but how much we actually have
copied.  As it is, on architectures that do not zero tail on
short copy we can leave uninitialized data in page marked uptodate.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-09 22:41:47 -05:00
Trond Myklebust
d9152114f7 pNFS/flexfiles: Ensure we have enough buffer for layoutreturn
The flexfiles client can piggyback both layout errors and layoutstats
as part of the layoutreturn. Both these payloads can get large, with
20 layout error entries taking up about 1.2K, and 4 layoutstats entries
taking up another 1K.
This patch allows a maximum payload of 4k by allocating a full page.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:59 -05:00
Trond Myklebust
5ba6a09e92 pNFS/flexfiles: Remove a redundant parameter in ff_layout_encode_ioerr()
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-09 20:26:58 -05:00
Darrick J. Wong
876bec6f9b vfs: refactor clone/dedupe_file_range common functions
Hoist both the XFS reflink inode state and preparation code and the XFS
file blocks compare functions into the VFS so that ocfs2 can take
advantage of it for reflink and dedupe.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2016-12-09 16:18:30 -08:00
Christoph Hellwig
a76b5b0437 fs: try to clone files first in vfs_copy_file_range
A clone is a perfectly fine implementation of a file copy, so most
file systems just implement the copy that way.  Instead of duplicating
this logic move it to the VFS.  Currently btrfs and XFS implement copies
the same way as clones and there is no behavior change for them, cifs
only implements clones and grow support for copy_file_range with this
patch.  NFS implements both, so this will allow copy_file_range to work
on servers that only implement CLONE and be lot more efficient on servers
that implements CLONE and COPY.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2016-12-09 16:17:19 -08:00
Linus Torvalds
af9468db44 Merge tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
 "A fix for an issue with ->d_revalidate() in ceph, causing frequent
  kernel crashes.

  Marked for stable - it goes back to 4.6, but started popping up only
  in 4.8"

* tag 'ceph-for-4.9-rc9' of git://github.com/ceph/ceph-client:
  ceph: don't set req->r_locked_dir in ceph_d_revalidate
2016-12-09 11:02:40 -08:00
Miklos Szeredi
d16744ec8a vfs: make generic_readlink() static
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Miklos Szeredi
dfeef68862 vfs: remove ".readlink = generic_readlink" assignments
If .readlink == NULL implies generic_readlink().

Generated by:

to_del="\.readlink.*=.*generic_readlink"
for i in `git grep -l $to_del`; do sed -i "/$to_del"/d $i; done

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Miklos Szeredi
76fca90e9f vfs: default to generic_readlink()
If i_op->readlink is NULL, but i_op->get_link is set then vfs_readlink()
defaults to calling generic_readlink().

The IOP_DEFAULT_READLINK flag indicates that the above conditions are met
and the default action can be taken.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Miklos Szeredi
fd4a0edf2a vfs: replace calling i_op->readlink with vfs_readlink()
Also check d_is_symlink() in callers instead of inode->i_op->readlink
because following patches will allow NULL ->readlink for symlinks.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:04 +01:00
Miklos Szeredi
2a07a1f5ab proc/self: use generic_readlink
The /proc/self and /proc/self-thread symlinks have separate but identical
functionality for reading and following.  This cleanup utilizes
generic_readlink to remove the duplication.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2016-12-09 16:45:03 +01:00
Miklos Szeredi
6c988f5759 ecryptfs: use vfs_get_link()
Here again we are copying form one buffer to another, while jumping through
hoops to make kernel memory look like userspace memory.

For no good reason, since vfs_get_link() provides exactly what is needed.

As a bonus, now the security hook for readlink is also called on the
underlying inode.

Note: this can be called from link-following context.  But this is okay:

 - not in RCU mode

 - commit e54ad7f1ee ("proc: prevent stacking filesystems on top")

 - ecryptfs is *reading* the underlying symlink not following it, so the
   right security hook is being called

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: Tyler Hicks <tyhicks@canonical.com>
2016-12-09 16:45:03 +01:00
Chris Mason
e5d6b12fe1 Btrfs: don't WARN() in btrfs_transaction_abort() for IO errors
btrfs_transaction_abort() has a WARN() to help us nail down whatever
problem lead to the abort.  But most of the time, we're aborting for EIO,
and the warning just adds noise.

Signed-off-by: Chris Mason <clm@fb.com>
2016-12-09 06:00:28 -08:00
Miklos Szeredi
3f9ca75516 bad_inode: add missing i_op initializers
New inode operations were forgotten to be added to bad_inode.  Most of the
time the op is checked for NULL before being called but marking the inode
bad and the check can race (very unlikely).

However in case of ->get_link() only DCACHE_SYMLINK_TYPE is checked before
calling the op, so there's no race and will definitely oops when trying to
follow links on such a beast.

Also remove comments about extinct ops.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
2016-12-09 11:57:43 +01:00
Dave Chinner
9807b773da Merge branch 'xfs-4.10-misc-fixes-4' into for-next 2016-12-09 16:56:26 +11:00
Eric Sandeen
9875258ca7 xfs: nuke unused tracepoint definitions
This is all unused code, so remove it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:54 +11:00
Darrick J. Wong
b24a978c37 xfs: use GPF_NOFS when allocating btree cursors
Use NOFS for allocating btree cursors, since they can be called
under the ilock.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:54 +11:00
Eryu Guan
0c187dc508 xfs: use xfs_vn_setattr_size to check on new size
Commit 6552321831 ("xfs: remove i_iolock and use i_rwsem in the
VFS inode instead") introduced a regression that truncate(2) doesn't
check on new size, so it succeeds even if the new size exceeds the
current resource limit. Because xfs_setattr_size() was used instead
of xfs_vn_setattr_size(), and the latter calls xfs_vn_change_ok()
first to do sanity check on permission and new size.

This is found by truncate03 test from ltp, and the following is a
simplified reproducer:

  #!/bin/bash
  dev=/dev/sda5
  mnt=/mnt/xfs

  mkfs -t xfs -f $dev
  mount $dev $mnt

  # set max file size to 16k
  ulimit -f 16
  truncate -s $((16 * 1024 + 1)) /mnt/xfs/testfile
  [ $? -eq 0 ] && echo "FAIL: truncate exceeded max file size"
  ulimit -f unlimited
  umount $mnt

Signed-off-by: Eryu Guan <eguan@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:54 +11:00
Dave Chinner
4cf4573d89 xfs: deprecate barrier/nobarrier mount option
We always perform integrity operations now, so these mount options
don't do anything. Deprecate them and mark them for removal in
in a year.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:54 +11:00
Dave Chinner
2291dab2c9 xfs: Always flush caches when integrity is required
There is no reason anymore for not issuing device integrity
operations when teh filesystem requires ordering or data integrity
guarantees. We should always issue cache flushes and FUA writes
where necessary and let the underlying storage optimise them as
necessary for correct integrity operation.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:54 +11:00
Eric Sandeen
2e1d23370e xfs: ignore leaf attr ichdr.count in verifier during log replay
When we create a new attribute, we first create a shortform
attribute, and try to fit the new attribute into it.
If that fails, we copy the (empty) attribute into a leaf attribute,
and do the copy again.  Thus there can be a transient state where
we have an empty leaf attribute.

If we encounter this during log replay, the verifier will fail.
So add a test to ignore this part of the leaf attr verification
during log replay.

Thanks as usual to dchinner for spotting the problem.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-12-09 16:49:47 +11:00
Fred Isaman
65990d1afb pNFS/flexfiles: Fix a deadlock on LAYOUTGET
We encountered a deadlock where the SEQUENCE that accompanied the
LAYOUTGET triggered a session drain, while ff_layout_alloc_lseg
triggered a GETDEVICEINFO.  The GETDEVICEINFO hung waiting for the
session drain, while the LAYOUTGET held the slot waiting for
alloc_lseg to finish.
  Avoid this by moving the call to nfs4_find_get_deviceid out of
ff_layout_alloc_lseg and into nfs4_ff_layout_prepare_ds.

Signed-off-by: Fred Isaman <fred.isaman@gmail.com>
[dros@primarydata.com: pNFS/flexfiles: fix races in ff_layout_mirror_valid]
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-08 21:49:57 -05:00
Jeff Layton
c3f4688a08 ceph: don't set req->r_locked_dir in ceph_d_revalidate
This function sets req->r_locked_dir which is supposed to indicate to
ceph_fill_trace that the parent's i_rwsem is locked for write.
Unfortunately, there is no guarantee that the dir will be locked when
d_revalidate is called, so we really don't want ceph_fill_trace to do
any dcache manipulation from this context. Clear req->r_locked_dir since
it's clearly not safe to do that.

What we really want to know with d_revalidate is whether the dentry
still points to the same inode. ceph_fill_trace installs a pointer to
the inode in req->r_target_inode, so we can just compare that to
d_inode(dentry) to see if it's the same one after the lookup.

Also, since we aren't generally interested in the parent here, we can
switch to using a GETATTR to hint that to the MDS, which also means that
we only need to reserve one cap.

Finally, just remove the d_unhashed check. That's really outside the
purview of a filesystem's d_revalidate. If the thing became unhashed
while we're checking it, then that's up to the VFS to handle anyway.

Fixes: 200fd27c8f ("ceph: use lookup request to revalidate dentry")
Link: http://tracker.ceph.com/issues/18041
Reported-by: Donatas Abraitis <donatas.abraitis@gmail.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-08 14:32:16 +01:00
Jaegeuk Kim
5eba8c5d1f f2fs: fix to access nullified flush_cmd_control pointer
f2fs_sync_file()             remount_ro
 - f2fs_readonly
                               - destroy_flush_cmd_control
 - f2fs_issue_flush
   - no fcc pointer!

So, this patch doesn't free fcc in this case, but just stop its kernel thread
which sends flush commands.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07 18:56:50 -08:00
Li Wang
64d2ab32ef vfs: fix put_compat_statfs64() does not handle errors
put_compat_statfs64() does NOT return -1 and setting errno to EOVERFLOW
when some variables(like: f_bsize) overflowed in the returned struct.

The reason is that the ubuf->f_blocks is __u64 type, it couldn't be
4bits as the judgement in put_comat_statfs64(). Here correct the
__u32 variables(in struct compat_statfs64) for comparison.

reproducer:
step1. mount hugetlbfs with two different pagesize on ppc64 arch.

$ hugeadm --pool-pages-max 16M:0
$ hugeadm --create-mount
$ mount | grep -i hugetlbfs
none on /var/lib/hugetlbfs/pagesize-16MB type hugetlbfs (rw,relatime,seclabel,pagesize=16777216)
none on /var/lib/hugetlbfs/pagesize-16GB type hugetlbfs (rw,relatime,seclabel,pagesize=17179869184)

step2. compile & run this C program.

$ cat statfs64_test.c

 #define _LARGEFILE64_SOURCE
 #include <stdio.h>
 #include <sys/syscall.h>
 #include <sys/statfs.h>

 int main()
 {
	struct statfs64 sb;
	int err;

	err = syscall(SYS_statfs64, "/var/lib/hugetlbfs/pagesize-16GB", sizeof(sb), &sb);
	if (err)
		return -1;

	printf("sizeof f_bsize = %d, f_bsize=%ld\n", sizeof(sb.f_bsize), sb.f_bsize);

	return 0;
 }

$ gcc -m32 statfs64_test.c
$ ./a.out
sizeof f_bsize = 4, f_bsize=0

Signed-off-by: Li Wang <liwang@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-12-07 17:44:38 -05:00
Jaegeuk Kim
a2125ff7dd f2fs: free meta pages if sanity check for ckpt is failed
This fixes missing freeing meta pages in the error case.

Tested-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07 14:38:16 -08:00
Jaegeuk Kim
2040fce83f f2fs: detect wrong layout
Previous mkfs.f2fs allows small partition inappropriately, so f2fs should detect
that as well.

Refer this in f2fs-tools.

mkfs.f2fs: detect small partition by overprovision ratio and # of segments

Reported-and-Tested-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-07 14:37:33 -08:00
Trond Myklebust
2f065ddb64 pNFS: Layoutreturn must free the layout after the layout-private data
The layout-private data may depend on the layout and/or the inode
still existing when it does post-processing and frees its data, so we
need to free them after calling lrp->ld_private.ops->free().

This fixes a mirror list corruption issue in the flexfiles driver.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:59 -05:00
Trond Myklebust
cb06793517 pNFS/flexfiles: Fix ff_layout_add_ds_error_locked()
When we're merging an old entry into our new entry, we want to ensure that
we add the list entry in the correct place.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:58 -05:00
NeilBrown
7a0566b38c NFSv4: Add missing nfs_put_lock_context()
Otherwise the lock context won't be freed when we're done with it.

From: NeilBrown <neilb@suse.com>
Fixes: 5bd3f817 ("NFSv4: change nfs4_select_rw_stateid to take a lock_context inplace of lock_owner")
Signed-off-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-12-07 13:41:58 -05:00