Commit Graph

45482 Commits

Author SHA1 Message Date
Josef Bacik
c83f8effef Btrfs: add tracepoint for adding block groups
I'm writing a tool to visualize the enospc system inside btrfs, I need this
tracepoint in order to keep track of the block groups in the system.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Josef Bacik
d555b6c380 Btrfs: warn_on for unaccounted spaces
These were hidden behind enospc_debug, which isn't helpful as they indicate
actual bugs, unlike the rest of the enospc_debug stuff which is really debug
information.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Josef Bacik
c48f49d63d Btrfs: change delayed reservation fallback behavior
We reserve space for the inode update when we first reserve space for writing to
a file.  However there are lots of ways that we can use this reservation and not
have it for subsequent ordered extents.  Previously we'd fall through and try to
reserve metadata bytes for this, then we'd just steal the full reservation from
the delalloc_block_rsv, and if that didn't have enough space we'd steal the full
reservation from the global reserve.  The problem with this is we can easily
just return ENOSPC and fallback to updating the inode item directly.  In the
worst case (assuming 4k nodesize) we'd steal 64kib from the global reserve if we
fall all the way through, however if we just fallback and update the inode
directly we'd only steal 4k * BTRFS_PATH_MAX in the worst case which is 32kib.

We would have also just added the extent item for the inode so we likely will
have already cow'ed down most of the way to the leaf containing the inode item,
so we are more often than not only need one or two nodesize's worth of
reservations.  Given the reservation for the extent itself is also a worst case
we will likely already have space to cover the inode update.

This change will make us behave better in the theoretical worst case, and much
better in the case that we don't have our reservation and cannot reserve more
metadata.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Josef Bacik
48c3d480e4 Btrfs: always reserve metadata for delalloc extents
There are a few races in the metadata reservation stuff.  First we add the bytes
to the block_rsv well after we've set the bit on the inode saying that we have
space for it and after we've reserved the bytes.  So use the normal
btrfs_block_rsv_add helper for this case.  Secondly we can flush delalloc
extents when we try to reserve space for our write, which means that we could
have used up the space for the inode and we wouldn't know because we only check
before the reservation.  So instead make sure we are always reserving space for
the inode update, and then if we don't need it release those bytes afterward.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Josef Bacik
25d609f86d Btrfs: fix callers of btrfs_block_rsv_migrate
So btrfs_block_rsv_migrate just unconditionally calls block_rsv_migrate_bytes.
Not only this but it unconditionally changes the size of the block_rsv.  This
isn't a bug strictly speaking, but it makes truncate block rsv's look funny
because every time we migrate bytes over its size grows, even though we only
want it to be a specific size.  So collapse this into one function that takes an
update_size argument and make truncate and evict not update the size for
consistency sake.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Josef Bacik
e40edf2da4 Btrfs: add bytes_readonly to the spaceinfo at once
For some reason we're adding bytes_readonly to the space info after we update
the space info with the block group info.  This creates a tiny race where we
could over-reserve space because we haven't yet taken out the bytes_readonly
bit.  Since we already know this information at the time we call
update_space_info, just pass it along so it can be updated all at once.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2016-07-07 18:45:53 +02:00
Ingo Molnar
4b4b20852d Merge branch 'timers/fast-wheel' into timers/core 2016-07-07 10:35:28 +02:00
Jaegeuk Kim
ac6f199984 f2fs: avoid latency-critical readahead of node pages
The f2fs_map_blocks is very related to the performance, so let's avoid any
latency to read ahead node pages.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:10 -07:00
Jaegeuk Kim
2c237ebaa4 f2fs: avoid writing node/metapages during writes
Let's keep more node/meta pages in run time.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:09 -07:00
Jaegeuk Kim
ad4edb8314 f2fs: produce more nids and reduce readahead nats
The readahead nat pages are more likely to be reclaimed quickly, so it'd better
to gather more free nids in advance.

And, let's keep some free nids as much as possible.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:08 -07:00
Jaegeuk Kim
52763a4b7a f2fs: detect host-managed SMR by feature flag
If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs
by default.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:07 -07:00
Jaegeuk Kim
67c3758d22 f2fs: call update_inode_page for orphan inodes
Let's store orphan inode pages right away.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:07 -07:00
Jaegeuk Kim
3e19886eda f2fs: report error for f2fs_parent_dir
If there is no dentry, we can report its error correctly.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2016-07-06 10:44:06 -07:00
David S. Miller
30d0844bdc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/mellanox/mlx5/core/en.h
	drivers/net/ethernet/mellanox/mlx5/core/en_main.c
	drivers/net/usb/r8152.c

All three conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-06 10:35:22 -07:00
Carlos Maiolino
ff0031d848 ext2: fix filesystem deadlock while reading corrupted xattr block
This bug can be reproducible with fsfuzzer, although, I couldn't reproduce it
100% of my tries, it is quite easily reproducible.

During the deletion of an inode, ext2_xattr_delete_inode() does not check if the
block pointed by EXT2_I(inode)->i_file_acl is a valid data block, this might
lead to a deadlock, when i_file_acl == 1, and the filesystem block size is 1024.

In that situation, ext2_xattr_delete_inode, will load the superblock's buffer
head (instead of a valid i_file_acl block), and then lock that buffer head,
which, ext2_sync_super will also try to lock, making the filesystem deadlock in
the following stack trace:

root     17180  0.0  0.0 113660   660 pts/0    D+   07:08   0:00 rmdir
/media/test/dir1

[<ffffffff8125da9f>] __sync_dirty_buffer+0xaf/0x100
[<ffffffff8125db03>] sync_dirty_buffer+0x13/0x20
[<ffffffffa03f0d57>] ext2_sync_super+0xb7/0xc0 [ext2]
[<ffffffffa03f10b9>] ext2_error+0x119/0x130 [ext2]
[<ffffffffa03e9d93>] ext2_free_blocks+0x83/0x350 [ext2]
[<ffffffffa03f3d03>] ext2_xattr_delete_inode+0x173/0x190 [ext2]
[<ffffffffa03ee9e9>] ext2_evict_inode+0xc9/0x130 [ext2]
[<ffffffff8123fd23>] evict+0xb3/0x180
[<ffffffff81240008>] iput+0x1b8/0x240
[<ffffffff8123c4ac>] d_delete+0x11c/0x150
[<ffffffff8122fa7e>] vfs_rmdir+0xfe/0x120
[<ffffffff812340ee>] do_rmdir+0x17e/0x1f0
[<ffffffff81234dd6>] SyS_rmdir+0x16/0x20
[<ffffffff81838cf2>] entry_SYSCALL_64_fastpath+0x1a/0xa4
[<ffffffffffffffff>] 0xffffffffffffffff

Fix this by using the same approach ext4 uses to test data blocks validity,
implementing ext2_data_block_valid.

An another possibility when the superblock is very corrupted, is that i_file_acl
is 1, block_count is 1 and first_data_block is 0. For such situations, we might
have i_file_acl pointing to a 'valid' block, but still step over the superblock.
The approach I used was to also test if the superblock is not in the range
described by ext2_data_block_valid() arguments

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-07-05 22:02:41 -04:00
Wang Shilong
079788d01e ext4: fix project quota accounting without quota limits enabled
We should always transfer quota accounting, regardless of whether
quota limits are enabled.

Steps to reproduce:
  # mkfs.ext4 /dev/sda4 -O quota,project
  # mount /dev/sda4 /mnt/test
  # cp /bin/bash /mnt/test
  # chattr -p 123 /mnt/test/bash
  # quota -v -P 123

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2016-07-05 21:33:52 -04:00
Theodore Ts'o
5b9554dc5b ext4: validate s_reserved_gdt_blocks on mount
If s_reserved_gdt_blocks is extremely large, it's possible for
ext4_init_block_bitmap(), which is called when ext4 sets up an
uninitialized block bitmap, to corrupt random kernel memory.  Add the
same checks which e2fsck has --- it must never be larger than
blocksize / sizeof(__u32) --- and then add a backup check in
ext4_init_block_bitmap() in case the superblock gets modified after
the file system is mounted.

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
2016-07-05 20:01:52 -04:00
Trond Myklebust
9a773e7c8d NFS nfs_vm_page_mkwrite: Don't freeze me, Bro...
Prevent filesystem freezes while handling the write page fault.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:08 -04:00
Trond Myklebust
e95fc4a069 NFSv4.2: llseek(SEEK_HOLE) and llseek(SEEK_DATA) don't require data sync
We want to ensure that we write the cached data to the server, but
don't require it be synced to disk. If the server reboots, we will
get a stateid error, which will cause us to retry anyway.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:08 -04:00
Trond Myklebust
837bb1d752 NFSv4.2: Fix writeback races in nfs4_copy_file_range
We need to ensure that any writes to the destination file are serialised
with the copy, meaning that the writeback has to occur under the inode lock.

Also relax the writeback requirement on the source, and rely on the
stateid checking to tell us if the source rebooted. Add the helper
nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as
is appropriate for pNFS servers that may need a layoutcommit.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:07 -04:00
Trond Myklebust
1e564d3dbd NFSv4.2: Fix a race in nfs42_proc_deallocate()
When punching holes in a file, we want to ensure the operation is
serialised w.r.t. other writes, meaning that we want to call
nfs_sync_inode() while holding the inode lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:07 -04:00
Trond Myklebust
79566ef018 NFS: Getattr doesn't require data sync semantics
When retrieving stat() information, NFS unfortunately does require us to
sync writes to disk in order to ensure that mtime and ctime are up to
date. However we shouldn't have to ensure that those writes are persisted.

Relaxing that requirement does mean that we may see an mtime/ctime change
if the server reboots and forces us to replay all writes.

The exception to this rule are pNFS clients that are required to send
layoutcommit, however that is dealt with by the call to pnfs_sync_inode()
in _nfs_revalidate_inode().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:06 -04:00
Trond Myklebust
651b0e7029 NFS: Do not aggressively cache file attributes in the case of O_DIRECT
A file that is open for O_DIRECT is by definition not obeying
close-to-open cache consistency semantics, so let's not cache
the attributes too aggressively either.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:06 -04:00
Trond Myklebust
be527494e0 NFS: Remove unused function nfs_revalidate_mapping_protected()
Clean up...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:05 -04:00
Trond Myklebust
f508d46ae4 NFS: Remove redundant waits for O_DIRECT in fsync() and write_begin()
We're now waiting immediately after taking the locks, so waiting
in fsync() and write_begin() is either redundant or potentially
subject to livelock (if not holding the lock).

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:05 -04:00
Trond Myklebust
f7b5c340ac NFS: Cleanup nfs_direct_complete()
There is only one caller that sets the "write" argument to true,
so just move the call to nfs_zap_mapping() and get rid of the
now redundant argument.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:04 -04:00
Trond Myklebust
a5864c999d NFS: Do not serialise O_DIRECT reads and writes
Allow dio requests to be scheduled in parallel, but ensuring that they
do not conflict with buffered I/O.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:04 -04:00
Trond Myklebust
18290650b1 NFS: Move buffered I/O locking into nfs_file_write()
Preparation for the patch that de-serialises O_DIRECT reads and
writes.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:03 -04:00
Trond Myklebust
89698b24d2 NFS Cleanup: move call to generic_write_checks() into fs/nfs/direct.c
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:03 -04:00
Trond Myklebust
2f3c7d87a3 NFS: Remove racy size manipulations in O_DIRECT
On success, the RPC callbacks will ensure that we make the appropriate calls
to nfs_writeback_update_inode()

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:02 -04:00
Trond Myklebust
a5314a7492 NFS: Ensure we reset the write verifier 'committed' value on resend.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:02 -04:00
Trond Myklebust
8fc3c38627 NFS: Fix O_DIRECT verifier problems
We should not be interested in looking at the value of the stable field,
since that could take any value.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:01 -04:00
Trond Myklebust
6712007734 pNFS: pnfs_layoutcommit_outstanding() is no longer used when !CONFIG_NFS_V4_1
Cleanup...

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:11:01 -04:00
Trond Myklebust
ac46bd374c pNFS: Ensure we layoutcommit before revalidating attributes
If we need to update the cached attributes, then we'd better make
sure that we also layoutcommit first. Otherwise, the server may have stale
attributes.

Prior to this patch, the revalidation code tried to "fix" this problem by
simply disabling attributes that would be affected by the layoutcommit.
That approach breaks nfs_writeback_check_extend(), leading to a file size
corruption.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:08:27 -04:00
Trond Myklebust
2e18d4d822 pNFS: Files and flexfiles always need to commit before layoutcommit
So ensure that we mark the layout for commit once the write is done,
and then ensure that the commit to ds is finished before sending
layoutcommit.

Note that by doing this, we're able to optimise away the commit
for the case of servers that don't need layoutcommit in order to
return updated attributes.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 19:08:01 -04:00
Trond Myklebust
bc28e1c2e3 pNFS/flexfiles: Clean up calls to pnfs_set_layoutcommit()
Let's just have one place where we check ff_layout_need_layoutcommit().

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 18:52:26 -04:00
Trond Myklebust
c001c87a63 pNFS/flexfiles: Fix layoutcommit after a commit to DS
We should always do a layoutcommit after commit to DS, except if
the layout segment we're using has set FF_FLAGS_NO_LAYOUTCOMMIT.

Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 18:52:26 -04:00
Trond Myklebust
73e6c5d854 pNFS/files: Fix layoutcommit after a commit to DS
According to the errata
https://www.rfc-editor.org/errata_search.php?rfc=5661&eid=2751
we should always send layout commit after a commit to DS.

Fixes: bc7d4b8fd0 ("nfs/filelayout: set layoutcommit...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2016-07-05 18:52:25 -04:00
yalin wang
de9e9181bc ext4: remove unused page_idx
Signed-off-by: yalin wang <yalin.wang2010@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.com>
2016-07-05 16:32:32 -04:00
Eric W. Biederman
5c0048280b dquot: For now explicitly don't support filesystems outside of init_user_ns
Mostly supporting filesystems outside of init_user_ns is
s/&init_usre_ns/dquot->dq_sb->s_user_ns/.  An actual need for
supporting quotas on filesystems outside of s_user_ns is quite a ways
away and to be done responsibily needs an audit on what can happen
with hostile quota files.  Until that audit is complete don't attempt
to support quota files on filesystems outside of s_user_ns.

Cc: Jan Kara <jack@suse.cz>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:12:39 -05:00
Eric W. Biederman
cfd4c70a18 quota: Handle quota data stored in s_user_ns in quota_setxquota
In Q_XSETQLIMIT use sb->s_user_ns to detect when we are dealing with
the filesystems notion of id 0.

Cc: Jan Kara <jack@suse.cz>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Inspired-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:12:20 -05:00
Eric W. Biederman
d49d37624a quota: Ensure qids map to the filesystem
Introduce the helper qid_has_mapping and use it to ensure that the
quota system only considers qids that map to the filesystems
s_user_ns.

In practice for quota supporting filesystems today this is the exact
same check as qid_valid.  As only 0xffffffff aka (qid_t)-1 does not
map into init_user_ns.

Replace the qid_valid calls with qid_has_mapping as values come in
from userspace.  This is harmless today and it prepares the quota
system to work on filesystems with quotas but mounted by unprivileged
users.

Call qid_has_mapping from dqget.  This ensures the passed in qid has a
prepresentation on the underlying filesystem.  Previously this was
unnecessary as filesystesm never had qids that could not map.  With
the introduction of filesystems outside of s_user_ns this will not
remain true.

All of this ensures the quota code never has to deal with qids that
don't map to the underlying filesystem.

Cc: Jan Kara <jack@suse.cz>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:12:05 -05:00
Eric W. Biederman
036d523641 vfs: Don't create inodes with a uid or gid unknown to the vfs
It is expected that filesystems can not represent uids and gids from
outside of their user namespace.  Keep things simple by not even
trying to create filesystem nodes with non-sense uids and gids.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:11:47 -05:00
Eric W. Biederman
0bd23d09b8 vfs: Don't modify inodes with a uid or gid unknown to the vfs
When a filesystem outside of init_user_ns is mounted it could have
uids and gids stored in it that do not map to init_user_ns.

The plan is to allow those filesystems to set i_uid to INVALID_UID and
i_gid to INVALID_GID for unmapped uids and gids and then to handle
that strange case in the vfs to ensure there is consistent robust
handling of the weirdness.

Upon a careful review of the vfs and filesystems about the only case
where there is any possibility of confusion or trouble is when the
inode is written back to disk.  In that case filesystems typically
read the inode->i_uid and inode->i_gid and write them to disk even
when just an inode timestamp is being updated.

Which leads to a rule that is very simple to implement and understand
inodes whose i_uid or i_gid is not valid may not be written.

In dealing with access times this means treat those inodes as if the
inode flag S_NOATIME was set.  Reads of the inodes appear safe and
useful, but any write or modification is disallowed.  The only inode
write that is allowed is a chown that sets the uid and gid on the
inode to valid values.  After such a chown the inode is normal and may
be treated as such.

Denying all writes to inodes with uids or gids unknown to the vfs also
prevents several oddball cases where corruption would have occurred
because the vfs does not have complete information.

One problem case that is prevented is attempting to use the gid of a
directory for new inodes where the directories sgid bit is set but the
directories gid is not mapped.

Another problem case avoided is attempting to update the evm hash
after setxattr, removexattr, and setattr.  As the evm hash includeds
the inode->i_uid or inode->i_gid not knowning the uid or gid prevents
a correct evm hash from being computed.  evm hash verification also
fails when i_uid or i_gid is unknown but that is essentially harmless
as it does not cause filesystem corruption.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2016-07-05 15:06:46 -05:00
Al Viro
c94c09535c nfs_atomic_open(): prevent parallel nfs_lookup() on a negative hashed
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-05 16:02:31 -04:00
Al Viro
00699ad857 Use the right predicate in ->atomic_open() instances
->atomic_open() can be given an in-lookup dentry *or* a negative one
found in dcache.  Use d_in_lookup() to tell one from another, rather
than d_unhashed().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-07-05 16:02:23 -04:00
Jann Horn
78fee0b684 orangefs: fix namespace handling
In orangefs_inode_getxattr(), an fsuid is written to dmesg. The kuid is
converted to a userspace uid via from_kuid(current_user_ns(), [...]), but
since dmesg is global, init_user_ns should be used here instead.

In copy_attributes_from_inode(), op_alloc() and fill_default_sys_attrs(),
upcall structures are populated with uids/gids that have been mapped into
the caller's namespace. However, those upcall structures are read by
another process (the userspace filesystem driver), and that process might
be running in another namespace. This effectively lets any user spoof its
uid and gid as seen by the userspace filesystem driver.

To fix the second issue, I just construct the opcall structures with
init_user_ns uids/gids and require the filesystem server to run in the
init namespace. Since orangefs is full of global state anyway (as the error
message in DUMP_DEVICE_ERROR explains, there can only be one userspace
orangefs filesystem driver at once), that shouldn't be a problem.

[
Why does orangefs even exist in the kernel if everything does upcalls into
userspace? What does orangefs do that couldn't be done with the FUSE
interface? If there is no good answer to those questions, I'd prefer to see
orangefs kicked out of the kernel. Can that be done for something that
shipped in a release?

According to commit f7ab093f74 ("Orangefs: kernel client part 1"), they
even already have a FUSE daemon, and the only rational reason (apart from
"but most of our users report preferring to use our kernel module instead")
given for not wanting to use FUSE is one "in-the-works" feature that could
probably be integated into FUSE instead.
]

This patch has been compile-tested.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-07-05 15:47:43 -04:00
Mike Marshall
3903f15008 Orangefs: allow O_DIRECT in open
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-07-05 15:47:35 -04:00
Andreas Gruenbacher
d373a712c1 orangefs: Remove useless xattr prefix arguments
Mike,

On Fri, Jun 3, 2016 at 9:44 PM, Mike Marshall <hubcap@omnibond.com> wrote:
> We use the return value in this one line you changed, our userspace code gets
> ill when we send it (-ENOMEM +1) as a key length...

ah, my mistake.  Here's a fixed version.

Thanks,
Andreas

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-07-05 15:47:27 -04:00
Andreas Gruenbacher
2ce8272a10 orangefs: Remove redundant "trusted." xattr handler
Orangefs has a catch-all xattr handler that effectively does what the
trusted handler does already.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
2016-07-05 15:47:22 -04:00