Hook the callback channel into the same session management machinery
as we use for the forward channel.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
We have no duplicate reply cache, so we always set the back channel
ca_maxresponsesize_cached to zero when negotiating the session.
That means we should always error out as soon as we see the server
set args->csa_cachethis.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
See RFC5661 Section 2.10.6.2: if retrying a request, and the old one is
still in progress, we must return NFS4ERR_DELAY as the reply to sequence.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Pull MIPS updates from Ralf Baechle:
"This is the main pull request for MIPS for 4.5 plus some 4.4 fixes.
The executive summary:
- ATH79 platform improvments, use DT bindings for the ATH79 USB PHY.
- Avoid useless rebuilds for zboot.
- jz4780: Add NEMC, BCH and NAND device tree nodes
- Initial support for the MicroChip's DT platform. As all the device
drivers are missing this is still of limited use.
- Some Loongson3 cleanups.
- The unavoidable whitespace polishing.
- Reduce clock skew when synchronizing the CPU cycle counters on CPU
startup.
- Add MIPS R6 fixes.
- Lots of cleanups across arch/mips as fallout from KVM.
- Lots of minor fixes and changes for IEEE 754-2008 support to the
FPU emulator / fp-assist software.
- Minor Ralink, BCM47xx and bcm963xx platform support improvments.
- Support SMP on BCM63168"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (84 commits)
MIPS: zboot: Add support for serial debug using the PROM
MIPS: zboot: Avoid useless rebuilds
MIPS: BMIPS: Enable ARCH_WANT_OPTIONAL_GPIOLIB
MIPS: bcm63xx: nvram: Remove unused bcm63xx_nvram_get_psi_size() function
MIPS: bcm963xx: Update bcm_tag field image_sequence
MIPS: bcm963xx: Move extended flash address to bcm_tag header file
MIPS: bcm963xx: Move Broadcom BCM963xx image tag data structure
MIPS: bcm63xx: nvram: Use nvram structure definition from header file
MIPS: bcm963xx: Add Broadcom BCM963xx board nvram data structure
MAINTAINERS: Add KVM for MIPS entry
MIPS: KVM: Add missing newline to kvm_err()
MIPS: Move KVM specific opcodes into asm/inst.h
MIPS: KVM: Use cacheops.h definitions
MIPS: Break down cacheops.h definitions
MIPS: Use EXCCODE_ constants with set_except_vector()
MIPS: Update trap codes
MIPS: Move Cause.ExcCode trap codes to mipsregs.h
MIPS: KVM: Make kvm_mips_{init,exit}() static
MIPS: KVM: Refactor added offsetof()s
MIPS: KVM: Convert EXPORT_SYMBOL to _GPL
...
Pull Ceph updates from Sage Weil:
"The two main changes are aio support in CephFS, and a series that
fixes several issues in the authentication key timeout/renewal code.
On top of that are a variety of cleanups and minor bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: remove outdated comment
libceph: kill off ceph_x_ticket_handler::validity
libceph: invalidate AUTH in addition to a service ticket
libceph: fix authorizer invalidation, take 2
libceph: clear messenger auth_retry flag if we fault
libceph: fix ceph_msg_revoke()
libceph: use list_for_each_entry_safe
ceph: use i_size_{read,write} to get/set i_size
ceph: re-send AIO write request when getting -EOLDSNAP error
ceph: Asynchronous IO support
ceph: Avoid to propagate the invalid page point
ceph: fix double page_unlock() in page_mkwrite()
rbd: delete an unnecessary check before rbd_dev_destroy()
libceph: use list_next_entry instead of list_entry_next
ceph: ceph_frag_contains_value can be boolean
ceph: remove unused functions in ceph_frag.h
Pull SMB3 fixes from Steve French:
"A collection of CIFS/SMB3 fixes.
It includes a couple bug fixes, a few for improved debugging of
cifs.ko and some improvements to the way cifs does key generation.
I do have some additional bug fixes I expect in the next week or two
(to address a problem found by xfstest, and some fixes for SMB3.11
dialect, and a couple patches that just came in yesterday that I am
reviewing)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs_dbg() outputs an uninitialized buffer in cifs_readdir()
cifs: fix race between call_async() and reconnect()
Prepare for encryption support (first part). Add decryption and encryption key generation. Thanks to Metze for helping with this.
cifs: Allow using O_DIRECT with cache=loose
cifs: Make echo interval tunable
cifs: Check uniqueid for SMB2+ and return -ESTALE if necessary
Print IP address of unresponsive server
cifs: Ratelimit kernel log messages
Pull final vfs updates from Al Viro:
- The ->i_mutex wrappers (with small prereq in lustre)
- a fix for too early freeing of symlink bodies on shmem (they need to
be RCU-delayed) (-stable fodder)
- followup to dedupe stuff merged this cycle
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: abort dedupe loop if fatal signals are pending
make sure that freeing shmem fast symlinks is RCU-delayed
wrappers for ->i_mutex access
lustre: remove unused declaration
fold orangefs_op_initialize() in there, don't bother locking something
nobody else could've seen yet, use kmem_cache_zalloc() instead of
explicit memset()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
... we are not going to get woken up anyway, so it's just going to time out
and whine.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
All timeouts are in _seconds_, so all calls are of form
MSECS_TO_JIFFIES(n * 1000), which is a convoluted way to
spell n * HZ.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Pull NFS client bugfixes and cleanups from Trond Myklebust:
"Bugfixes:
- pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
- pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
Cleanups:
- NFS: Simplify nfs_request_add_commit_list() arguments"
* tag 'nfs-for-4.5-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
pNFS/flexfiles: Fix an XDR encoding bug in layoutreturn
NFS: Simplify nfs_request_add_commit_list() arguments
pNFS/flexfiles: Improve merging of errors in LAYOUTRETURN
* create with refcount 1
* make op_release() decrement and free if zero (i.e. old put_op()
has become that).
* mark when submitter has given up waiting; from that point nobody
else can move between the lists, change state, etc.
* have daemon read/write_iter grab a reference when picking op
and *always* give it up in the end
* don't put into hash until we know it's been successfully passed to
daemon
* move op->lock _lower_ than htab_in_progress_lock (and make sure
to take it in purge_inprogress_ops())
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
... otherwise some thread is running in .text that is about to
be freed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
If the program running dedupe receives a fatal signal during the
dedupe loop, we should bail out to avoid tying up the system.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Previously in DAX we assumed that calls to get_block() would set
bh.b_bdev, and we would then use that value even in error cases for
debugging. This caused a NULL pointer dereference in __dax_dbg() which
was fixed by a previous commit, but that commit only changed the one
place where we were hitting an error.
Instead, update dax.c so that we always initialize bh.b_bdev as best we
can based on the information that DAX has. get_block() may or may not
update to a new value, but this at least lets us get something helpful
from bh.b_bdev for error messages and not have to worry about whether it
was set by get_block() or not.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To properly handle fsync/msync in an efficient way DAX needs to track
dirty pages so it is able to flush them durably to media on demand.
The tracking of dirty pages is done via the radix tree in struct
address_space. This radix tree is already used by the page writeback
infrastructure for tracking dirty pages associated with an open file,
and it already has support for exceptional (non struct page*) entries.
We build upon these features to add exceptional entries to the radix
tree for DAX dirty PMD or PTE pages at fault time.
[dan.j.williams@intel.com: fix dax_pmd_dbg build warning]
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for tracking dirty DAX entries in the struct address_space
radix tree. This tree is already used for dirty page writeback, and it
already supports the use of exceptional (non struct page*) entries.
In order to properly track dirty DAX pages we will insert new
exceptional entries into the radix tree that represent dirty DAX PTE or
PMD pages. These exceptional entries will also contain the writeback
addresses for the PTE or PMD faults that we can use at fsync/msync time.
There are currently two types of exceptional entries (shmem and shadow)
that can be placed into the radix tree, and this adds a third. We rely
on the fact that only one type of exceptional entry can be found in a
given radix tree based on its usage. This happens for free with DAX vs
shmem but we explicitly prevent shadow entries from being added to radix
trees for DAX mappings.
The only shadow entries that would be generated for DAX radix trees
would be to track zero page mappings that were created for holes. These
pages would receive minimal benefit from having shadow entries, and the
choice to have only one type of exceptional entry in a given radix tree
makes the logic simpler both in clear_exceptional_entry() and in the
rest of DAX.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jan Kara <jack@suse.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When we get a DAX PMD fault for a write it is possible that there could
be some number of 4k zero pages already present for the same range that
were inserted to service reads from a hole. These 4k zero pages need to
be unmapped from the VMAs and removed from the struct address_space
radix tree before the real DAX PMD entry can be inserted.
For PTE faults this same use case also exists and is handled by a
combination of unmap_mapping_range() to unmap the VMAs and
delete_from_page_cache() to remove the page from the address_space radix
tree.
For PMD faults we do have a call to unmap_mapping_range() (protected by
a buffer_new() check), but nothing clears out the radix tree entry. The
buffer_new() check is also incorrect as the current ext4 and XFS
filesystem code will never return a buffer_head with BH_New set, even
when allocating new blocks over a hole. Instead the filesystem will
zero the blocks manually and return a buffer_head with only BH_Mapped
set.
Fix this situation by removing the buffer_new() check and adding a call
to truncate_inode_pages_range() to clear out the radix tree entries
before we insert the DAX PMD.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In __dax_pmd_fault() we currently assume that get_block() will always
set bh.b_bdev and we unconditionally dereference it in __dax_dbg().
This assumption isn't always true - when called for reads of holes
ext4_dax_mmap_get_block() returns a buffer head where bh->b_bdev is
never set. I hit this BUG while testing the DAX PMD fault path.
Instead, initialize bh.b_bdev before passing bh into get_block(). It is
possible that the filesystem's get_block() will update bh.b_bdev, and
this is fine - we just want to initialize bh.b_bdev to something
reasonable so that the calls to __dax_dbg() work and print something
useful.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).
Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull more btrfs updates from Chris Mason:
"These are mostly fixes that we've been testing, but also we grabbed
and tested a few small cleanups that had been on the list for a while.
Zhao Lei's patchset also fixes some early ENOSPC buglets"
* 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (21 commits)
btrfs: raid56: Use raid_write_end_io for scrub
btrfs: Remove unnecessary ClearPageUptodate for raid56
btrfs: use rbio->nr_pages to reduce calculation
btrfs: Use unified stripe_page's index calculation
btrfs: Fix calculation of rbio->dbitmap's size calculation
btrfs: Fix no_space in write and rm loop
btrfs: merge functions for wait snapshot creation
btrfs: delete unused argument in btrfs_copy_from_user
btrfs: Use direct way to determine raid56 write/recover mode
btrfs: Small cleanup for get index_srcdev loop
btrfs: Enhance chunk validation check
btrfs: Enhance super validation check
Btrfs: fix deadlock running delayed iputs at transaction commit time
Btrfs: fix typo in log message when starting a balance
btrfs: remove duplicate const specifier
btrfs: initialize the seq counter in struct btrfs_device
Btrfs: clean up an error code in btrfs_init_space_info()
btrfs: fix iterator with update error in backref.c
Btrfs: fix output of compression message in btrfs_parse_options()
Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots
...
Pull ext4 updates from Ted Ts'o:
"Some locking and page fault bug fixes from Jan Kara, some ext4
encryption fixes from me, and Li Xi's Project Quota commits"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fs: clean up the flags definition in uapi/linux/fs.h
ext4: add FS_IOC_FSSETXATTR/FS_IOC_FSGETXATTR interface support
ext4: add project quota support
ext4: adds project ID support
ext4 crypto: simplify interfaces to directory entry insert functions
ext4 crypto: add missing locking for keyring_key access
ext4: use pre-zeroed blocks for DAX page faults
ext4: implement allocation of pre-zeroed blocks
ext4: provide ext4_issue_zeroout()
ext4: get rid of EXT4_GET_BLOCKS_NO_LOCK flag
ext4: document lock ordering
ext4: fix races of writeback with punch hole and zero range
ext4: fix races between buffered IO and collapse / insert range
ext4: move unlocked dio protection from ext4_alloc_file_blocks()
ext4: fix races between page faults and hole punching
Pull more xfs updates from Dave Chinner:
"This is the second update for XFS that I mentioned in the original
pull request last week.
It contains a revert for a suspend regression in 4.4 and a fix for a
long standing log recovery issue that has been further exposed by all
the log recovery changes made in the original 4.5 merge.
There is one more thing in this pull request - one that I forgot to
merge into the origin. That is, pulling the XFS_IOC_FS[GS]ETXATTR
ioctl up to the VFS level so that other filesystems can also use it
for modifying project quota IDs
Summary:
- promotion of XFS_IOC_FS[GS]ETXATTR ioctl to the vfs level so that
it can be shared with other filesystems. The ext4 project quota
functionality is the first target for this. The commits in this
series have not been updated with review or final SOB tags because
the branch they were originally published in was needed by ext4.
Those tags are:
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dave Chinner <david@fromrobit.com>
- Revert a change that is causing suspend failures.
- Fix a use-after-free that can occur on log mount failures. Been
around forever, but now exposed by other changes to log recovery
made in the first 4.5 merge"
* tag 'xfs-for-linus-4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: log mount failures don't wait for buffers to be released
Revert "xfs: clear PF_NOFREEZE for xfsaild kthread"
xfs: introduce per-inode DAX enablement
xfs: use FS_XFLAG definitions directly
fs: XFS_IOC_FS[SG]SETXATTR to FS_IOC_FS[SG]ETXATTR promotion
Pull more vfs updates from Al Viro:
"Embarrassing braino fix + pipe page accounting + fixing an eyesore in
find_filesystem() (checking that s1 is equal to prefix of s2 of given
length can be done in many ways, but "compare strlen(s1) with length
and then do strncmp()" is not a good one...)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[regression] fix braino in fs/dlm/user.c
pipe: limit the per-user amount of pages allocated in pipes
find_filesystem(): simplify comparison