Commit Graph

49262 Commits

Author SHA1 Message Date
Jeff Layton
9670079f5f ceph: move xattr initialzation before the encoding past the ceph_mds_caps
Just for clarity. This part is inside the header, so it makes sense to
group it with the rest of the stuff in the header.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:28 +01:00
Jeff Layton
4945a08479 ceph: fix minor typo in unsafe_request_wait
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Yan, Zheng
5f743e4566 ceph: record truncate size/seq for snap data writeback
Dirty snapshot data needs to be flushed unconditionally. If they
were created before truncation, writeback should use old truncate
size/seq.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Yan, Zheng
e9e427f0a1 ceph: check availability of mds cluster on mount
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Yan, Zheng
7ce469a53e ceph: fix splice read for no Fc capability case
When iov_iter type is ITER_PIPE, copy_page_to_iter() increases
the page's reference and add the page to a pipe_buffer. It also
set the pipe_buffer's ops to page_cache_pipe_buf_ops. The comfirm
callback in page_cache_pipe_buf_ops expects the page is from page
cache and uptodate, otherwise it return error.

For ceph_sync_read() case, pages are not from page cache. So we
can't call copy_page_to_iter() when iov_iter type is ITER_PIPE.
The fix is using iov_iter_get_pages_alloc() to allocate pages
for the pipe. (the code is similar to default_file_splice_read)

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Yan, Zheng
2b1ac852eb ceph: try getting buffer capability for readahead/fadvise
For readahead/fadvise cases, caller of ceph_readpages does not
hold buffer capability. Pages can be added to page cache while
there is no buffer capability. This can cause data integrity
issue.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Nikolay Borisov
5c341ee328 ceph: fix scheduler warning due to nested blocking
try_get_cap_refs can be used as a condition in a wait_event* calls.
This is all fine until it has to call __ceph_do_pending_vmtruncate,
which in turn acquires the i_truncate_mutex. This leads to a situation
in which a task's state is !TASK_RUNNING and at the same time it's
trying to acquire a sleeping primitive. In essence a nested sleeping
primitives are being used. This causes the following warning:

WARNING: CPU: 22 PID: 11064 at kernel/sched/core.c:7631 __might_sleep+0x9f/0xb0()
do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff8109447d>] prepare_to_wait_event+0x5d/0x110
 ipmi_msghandler tcp_scalable ib_qib dca ib_mad ib_core ib_addr ipv6
CPU: 22 PID: 11064 Comm: fs_checker.pl Tainted: G           O    4.4.20-clouder2 #6
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015
 0000000000000000 ffff8838b416fa88 ffffffff812f4409 ffff8838b416fad0
 ffffffff81a034f2 ffff8838b416fac0 ffffffff81052b46 ffffffff81a0432c
 0000000000000061 0000000000000000 0000000000000000 ffff88167bda54a0
Call Trace:
 [<ffffffff812f4409>] dump_stack+0x67/0x9e
 [<ffffffff81052b46>] warn_slowpath_common+0x86/0xc0
 [<ffffffff81052bcc>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff8109447d>] ? prepare_to_wait_event+0x5d/0x110
 [<ffffffff8109447d>] ? prepare_to_wait_event+0x5d/0x110
 [<ffffffff8107767f>] __might_sleep+0x9f/0xb0
 [<ffffffff81612d30>] mutex_lock+0x20/0x40
 [<ffffffffa04eea14>] __ceph_do_pending_vmtruncate+0x44/0x1a0 [ceph]
 [<ffffffffa04fa692>] try_get_cap_refs+0xa2/0x320 [ceph]
 [<ffffffffa04fd6f5>] ceph_get_caps+0x255/0x2b0 [ceph]
 [<ffffffff81094370>] ? wait_woken+0xb0/0xb0
 [<ffffffffa04f2c11>] ceph_write_iter+0x2b1/0xde0 [ceph]
 [<ffffffff81613f22>] ? schedule_timeout+0x202/0x260
 [<ffffffff8117f01a>] ? kmem_cache_free+0x1ea/0x200
 [<ffffffff811b46ce>] ? iput+0x9e/0x230
 [<ffffffff81077632>] ? __might_sleep+0x52/0xb0
 [<ffffffff81156147>] ? __might_fault+0x37/0x40
 [<ffffffff8119e123>] ? cp_new_stat+0x153/0x170
 [<ffffffff81198cfa>] __vfs_write+0xaa/0xe0
 [<ffffffff81199369>] vfs_write+0xa9/0x190
 [<ffffffff811b6d01>] ? set_close_on_exec+0x31/0x70
 [<ffffffff8119a056>] SyS_write+0x46/0xa0

This happens since wait_event_interruptible can interfere with the
mutex locking code, since they both fiddle with the task state.

Fix the issue by using the newly-added nested blocking infrastructure
in 61ada528de ("sched/wait: Provide infrastructure to deal with
nested blocking")

Link: https://lwn.net/Articles/628628/
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2016-12-12 23:54:27 +01:00
Zhi Zhang
a380a031cb ceph: fix printing wrong return variable in ceph_direct_read_write()
Fix printing wrong return variable for invalidate_inode_pages2_range in
ceph_direct_read_write().

Signed-off-by: Zhi Zhang <zhang.david2011@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2016-12-12 23:54:27 +01:00
Ilya Dryomov
0dde584882 libceph: drop len argument of *verify_authorizer_reply()
The length of the reply is protocol-dependent - for cephx it's
ceph_x_authorize_reply.  Nothing sensible can be passed from the
messenger layer anyway.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-12-12 23:09:21 +01:00
Richard Weinberger
fc4b891bbe ubifs: Raise write version to 5
Starting with version 5 the following properties change:
 - UBIFS_FLG_DOUBLE_HASH is mandatory
 - UBIFS_FLG_ENCRYPTION is optional but depdens on UBIFS_FLG_DOUBLE_HASH
 - Filesystems with unknown super block flags will be rejected, this
   allows us in future to add new features without raising the UBIFS
   write version.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
e021986ee4 ubifs: Implement UBIFS_FLG_ENCRYPTION
This feature flag indicates that the filesystem contains encrypted
files.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
d63d61c169 ubifs: Implement UBIFS_FLG_DOUBLE_HASH
This feature flag indicates that all directory entry nodes have a 32bit
cookie set and therefore UBIFS is allowed to perform lookups by hash.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
cc41a53652 ubifs: Use a random number for cookies
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
528e3d178f ubifs: Add full hash lookup support
UBIFS stores a 32bit hash of every file, for traditional lookups by name
this scheme is fine since UBIFS can first try to find the file by the
hash of the filename and upon collisions it can walk through all entries
with the same hash and do a string compare.
When filesnames are encrypted fscrypto will ask the filesystem for a
unique cookie, based on this cookie the filesystem has to be able to
locate the target file again. With 32bit hashes this is impossible
because the chance for collisions is very high. Do deal with that we
store a 32bit cookie directly in the UBIFS directory entry node such
that we get a 64bit cookie (32bit from filename hash and the dent
cookie). For a lookup by hash UBIFS finds the entry by the first 32bit
and then compares the dent cookie. If it does not match, it has to do a
linear search of the whole directory and compares all dent cookies until
the correct entry is found.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
b91dc9816e ubifs: Rename tnc_read_node_nm
tnc_read_hashed_node() is a better name since we read a node
by a given hash, not a name.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
ca7f85be8d ubifs: Add support for encrypted symlinks
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
f4f61d2cc6 ubifs: Implement encrypted filenames
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
b9bc8c7bdb ubifs: Make r5 hash binary string aware
As of now all filenames known by UBIFS are strings with a NUL
terminator. With encrypted filenames a filename can be any binary
string and the r5 function cannot search for the NUL terminator.
UBIFS always knows how long a filename is, therefore we can change
the hash function to iterate over the filename length to work
correctly with binary strings.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
304790c038 ubifs: Relax checks in ubifs_validate_entry()
With encrypted filenames we store raw binary data, doing
string tests is no longer possible.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
7799953b34 ubifs: Implement encrypt/decrypt for all IO
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
1ee77870c9 ubifs: Constify struct inode pointer in ubifs_crypt_is_encrypted()
...and provide a non const variant for fscrypto

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
f1f52d6b02 ubifs: Introduce new data node field, compr_size
When data of a data node is compressed and encrypted
we need to store the size of the compressed data because
before encryption we may have to add padding bytes.

For the new field we consume the last two padding bytes
in struct ubifs_data_node. Two bytes are fine because
the data length is at most 4096.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
959c2de2b3 ubifs: Enforce crypto policy in mmap
We need this extra check in mmap because a process could
gain an already opened fd.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
700eada82a ubifs: Massage assert in ubifs_xattr_set() wrt. fscrypto
When we're creating a new inode in UBIFS the inode is not
yet exposed and fscrypto calls ubifs_xattr_set() without
holding the inode mutex. This is okay but ubifs_xattr_set()
has to know about this.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
9270b2f4cd ubifs: Preload crypto context in ->lookup()
...and mark the dentry as encrypted.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
ac7e47a9ed ubifs: Enforce crypto policy in ->link and ->rename
When a file is moved or linked into another directory
its current crypto policy has to be compatible with the
target policy.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
a79bff21c1 ubifs: Implement file open operation
We need ->open() for files to load the crypto key.
If the no key is present and the file is encrypted,
refuse to open.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
ba40e6a3c4 ubifs: Implement directory open operation
We need the ->open() hook to load the crypto context
which is needed for all crypto operations within that
directory.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
43b113fea2 ubifs: Massage ubifs_listxattr() for encryption context
We have to make sure that we don't expose our internal
crypto context to userspace.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
d475a50745 ubifs: Add skeleton for fscrypto
This is the first building block to provide file level
encryption on UBIFS.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
6a5e98ab7d ubifs: Define UBIFS crypto context xattr
Like ext4 UBIFS will store the crypto context in a xattr
attribute.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
ade46c3a60 ubifs: Export xattr get and set functions
For fscrypto we need this function outside of xattr.c.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Richard Weinberger
f6337d8426 ubifs: Export ubifs_check_dir_empty()
fscrypto will need this function too. Also get struct ubifs_info
from the provided inode. Not all callers will have a reference to
struct ubifs_info.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:07:38 +01:00
Christophe Jaillet
d40a796217 ubifs: Remove some dead code
'ubifs_fast_find_freeable()' can not return an error pointer, so this test
can be removed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:28 +01:00
Rafał Miłecki
1b7fc2c006 ubifs: Use dirty_writeback_interval value for wbuf timer
Right now wbuf timer has hardcoded timeouts and there is no place for
manual adjustments. Some projects / cases many need that though. Few
file systems allow doing that by respecting dirty_writeback_interval
that can be set using sysctl (dirty_writeback_centisecs).

Lowering dirty_writeback_interval could be some way of dealing with user
space apps lacking proper fsyncs. This is definitely *not* a perfect
solution but we don't have ideal (user space) world. There were already
advanced discussions on this matter, mostly when ext4 was introduced and
it wasn't behaving as ext3. Anyway, the final decision was to add some
hacks to the ext4, as trying to fix whole user space or adding new API
was pointless.

We can't (and shouldn't?) just follow ext4. We can't e.g. sync on close
as this would cause too many commits and flash wearing. On the other
hand we still should allow some trade-off between -o sync and default
wbuf timeout. Respecting dirty_writeback_interval should allow some sane
cutomizations if used warily.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:24 +01:00
Rafał Miłecki
854826c9d5 ubifs: Drop softlimit and delta fields from struct ubifs_wbuf
Values of these fields are set during init and never modified. They are
used (read) in a single function only. There isn't really any reason to
keep them in a struct. It only makes struct just a bit bigger without
any visible gain.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-12-12 23:06:11 +01:00
Yunlei He
c0ed4405a9 f2fs: fix a missing size change in f2fs_setattr
This patch fix a missing size change in f2fs_setattr

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-12-12 11:09:05 -08:00
Christophe JAILLET
04102c76a7 orangefs: Axe some dead code
The "perf_counter_reset" case has already been handled above.
Moreover "ORANGEFS_PARAM_REQUEST_OP_READAHEAD_COUNT_SIZE" is not a really
consistent.
It is likely that this (dead) code is a cut and paste left over.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-12-12 12:49:20 -05:00
Colin Ian King
4defb5f912 orangefs: fix memory leak of string 'new' on exit path
allocates string 'new' is not free'd on the exit path when
cdm_element_count <= 0. Fix this by kfree'ing it.

Fixes CoverityScan CID#1375923 "Resource Leak"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-12-12 11:43:25 -05:00
Chris Mason
7c4c71ac8a Revert "Btrfs: adjust len of writes if following a preallocated extent"
This is exposing an existing deadlock between fsync and AIO.  Until we
have the deadlock fixed, I'm pulling this one out.

This reverts commit a23eaa875f.

Signed-off-by: Chris Mason <clm@fb.com>
2016-12-11 15:27:15 -08:00
David Gstir
6a34e4d2be fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL
... to better explain its purpose after introducing in-place encryption
without bounce buffer.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:33:18 -05:00
David Gstir
f32d7ac20a fscrypt: Delay bounce page pool allocation until needed
Since fscrypt users can now indicated if fscrypt_encrypt_page() should
use a bounce page, we can delay the bounce page pool initialization util
it is really needed. That is until fscrypt_operations has no
FS_CFLG_OWN_PAGES flag set.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:33:11 -05:00
David Gstir
bd7b829038 fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page()
Rename the FS_CFLG_INPLACE_ENCRYPTION flag to FS_CFLG_OWN_PAGES which,
when set, indicates that the fs uses pages under its own control as
opposed to writeback pages which require locking and a bounce buffer for
encryption.

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:12 -05:00
David Gstir
1400451f04 fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page()
- Improve documentation
- Add BUG_ON(len == 0) to avoid accidental switch of offs and len
parameters
- Improve variable names for readability

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:12 -05:00
David Gstir
9e532772b4 fscrypt: Never allocate fscrypt_ctx on in-place encryption
In case of in-place encryption fscrypt_ctx was allocated but never
released. Since we don't need it for in-place encryption, we skip
allocating it.

Fixes: 1c7dcf69ee ("fscrypt: Add in-place encryption mode")

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:11 -05:00
David Gstir
e550c16c8a fscrypt: Use correct index in decrypt path.
Actually use the fs-provided index instead of always using page->index
which is only set for page-cache pages.

Fixes: 9c4bb8a3a9 ("fscrypt: Let fs select encryption index/tweak")

Signed-off-by: David Gstir <david@sigma-star.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:10 -05:00
Theodore Ts'o
cc4e0df038 fscrypt: move non-public structures and constants to fscrypt_private.h
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
2016-12-11 16:26:09 -05:00
Theodore Ts'o
b98701df34 fscrypt: unexport fscrypt_initialize()
The fscrypt_initalize() function isn't used outside fs/crypto, so
there's no point making it be an exported symbol.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
2016-12-11 16:26:08 -05:00
Theodore Ts'o
3325bea5b2 fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info()
To avoid namespace collisions, rename get_crypt_info() to
fscrypt_get_crypt_info().  The function is only used inside the
fs/crypto directory, so declare it in the new header file,
fscrypt_private.h.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Biggers <ebiggers@google.com>
2016-12-11 16:26:08 -05:00
Eric Biggers
db717d8e26 fscrypto: move ioctl processing more fully into common code
Multiple bugs were recently fixed in the "set encryption policy" ioctl.
To make it clear that fscrypt_process_policy() and fscrypt_get_policy()
implement ioctls and therefore their implementations must take standard
security and correctness precautions, rename them to
fscrypt_ioctl_set_policy() and fscrypt_ioctl_get_policy().  Make the
latter take in a struct file * to make it consistent with the former.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-12-11 16:26:07 -05:00