Commit Graph

5966 Commits

Author SHA1 Message Date
Linus Torvalds
cfb82e1df8 Merge tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 vfs updates from Arnd Bergmann:
 "Add inode timestamp clamping.

  This series from Deepa Dinamani adds a per-superblock minimum/maximum
  timestamp limit for a file system, and clamps timestamps as they are
  written, to avoid random behavior from integer overflow as well as
  having different time stamps on disk vs in memory.

  At mount time, a warning is now printed for any file system that can
  represent current timestamps but not future timestamps more than 30
  years into the future, similar to the arbitrary 30 year limit that was
  added to settimeofday().

  This was picked as a compromise to warn users to migrate to other file
  systems (e.g. ext4 instead of ext3) when they need the file system to
  survive beyond 2038 (or similar limits in other file systems), but not
  get in the way of normal usage"

* tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
  ext4: Reduce ext4 timestamp warnings
  isofs: Initialize filesystem timestamp ranges
  pstore: fs superblock limits
  fs: omfs: Initialize filesystem timestamp ranges
  fs: hpfs: Initialize filesystem timestamp ranges
  fs: ceph: Initialize filesystem timestamp ranges
  fs: sysv: Initialize filesystem timestamp ranges
  fs: affs: Initialize filesystem timestamp ranges
  fs: fat: Initialize filesystem timestamp ranges
  fs: cifs: Initialize filesystem timestamp ranges
  fs: nfs: Initialize filesystem timestamp ranges
  ext4: Initialize timestamps limits
  9p: Fill min and max timestamps in sb
  fs: Fill in max and min timestamps in superblock
  utimes: Clamp the timestamps before update
  mount: Add mount warning for impending timestamp expiry
  timestamp_truncate: Replace users of timespec64_trunc
  vfs: Add timestamp_truncate() api
  vfs: Add file timestamp range support
2019-09-19 09:42:37 -07:00
Linus Torvalds
53e5e7a7a7 Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs namei updates from Al Viro:
 "Pathwalk-related stuff"

[ Audit-related cleanups, misc simplifications, and easier to follow
  nd->root refcounts     - Linus ]

* 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  devpts_pty_kill(): don't bother with d_delete()
  infiniband: don't bother with d_delete()
  hypfs: don't bother with d_delete()
  fs/namei.c: keep track of nd->root refcount status
  fs/namei.c: new helper - legitimize_root()
  kill the last users of user_{path,lpath,path_dir}()
  namei.h: get the comments on LOOKUP_... in sync with reality
  kill LOOKUP_NO_EVAL, don't bother including namei.h from audit.h
  audit_inode(): switch to passing AUDIT_INODE_...
  filename_mountpoint(): make LOOKUP_NO_EVAL unconditional there
  filename_lookup(): audit_inode() argument is always 0
2019-09-18 13:03:01 -07:00
Trond Myklebust
eb3d8f4223 NFS: Fix inode fileid checks in attribute revalidation code
We want to throw out the attrbute if it refers to the mounted on fileid,
and not the real fileid. However we do not want to block cache consistency
updates from NFSv4 writes.

Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Fixes: 7e10cc25bf ("NFS: Don't refresh attributes with mounted-on-file...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-09-02 13:10:19 -04:00
Al Viro
fbb7d9d56d kill LOOKUP_NO_EVAL, don't bother including namei.h from audit.h
The former has no users left; the latter was only to get LOOKUP_...
values to remapper in audit_inode() and that's an ex-parrot now.

All places that use symbols from namei.h include it either directly
or (in a few cases) via a local header, like fs/autofs/autofs_i.h

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-08-30 21:29:32 -04:00
Deepa Dinamani
1fcb79c1b2 fs: nfs: Initialize filesystem timestamp ranges
Fill in the appropriate limits to avoid inconsistencies
in the vfs cached inode times when timestamps are
outside the permitted range.

The time formats for various verious is detailed in the
RFCs as below:

https://tools.ietf.org/html/rfc7862(time metadata)
https://tools.ietf.org/html/rfc7530:

nfstime4

   struct nfstime4 {
           int64_t         seconds;
           uint32_t        nseconds;
   };

https://tools.ietf.org/html/rfc1094

          struct timeval {
              unsigned int seconds;
              unsigned int useconds;
          };

https://tools.ietf.org/html/rfc1813

struct nfstime3 {
         uint32   seconds;
         uint32   nseconds;
      };

Use the limits as per the RFC.

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: trond.myklebust@hammerspace.com
Cc: anna.schumaker@netapp.com
Cc: linux-nfs@vger.kernel.org
2019-08-30 07:27:18 -07:00
YueHaibing
99300a8526 NFS: remove set but not used variable 'mapping'
Fixes gcc '-Wunused-but-set-variable' warning:

fs/nfs/write.c: In function nfs_page_async_flush:
fs/nfs/write.c:609:24: warning: variable mapping set but not used [-Wunused-but-set-variable]

It is not use since commit aefb623c422e ("NFS: Fix
writepage(s) error handling to not report errors twice")

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27 10:24:56 -04:00
Trond Myklebust
d33d4beb52 NFSv2: Fix write regression
Ensure we update the write result count on success, since the
RPC call itself does not do so.

Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
2019-08-27 10:24:56 -04:00
Trond Myklebust
71affe9be4 NFSv2: Fix eof handling
If we received a reply from the server with a zero length read and
no error, then that implies we are at eof.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-27 10:24:56 -04:00
Trond Myklebust
96c4145599 NFS: Fix writepage(s) error handling to not report errors twice
If writepage()/writepages() saw an error, but handled it without
reporting it, we should not be re-reporting that error on exit.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26 15:31:29 -04:00
Trond Myklebust
8f54c7a4ba NFS: Fix spurious EIO read errors
If the client attempts to read a page, but the read fails due to some
spurious error (e.g. an ACCESS error or a timeout, ...) then we need
to allow other processes to retry.
Also try to report errors correctly when doing a synchronous readpage.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26 15:31:29 -04:00
Trond Myklebust
7af46292da pNFS/flexfiles: Don't time out requests on hard mounts
If the mount is hard, we should ignore the 'io_maxretrans' module
parameter so that we always keep retrying.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26 15:31:29 -04:00
Trond Myklebust
d5711920ec Revert "NFSv4/flexfiles: Abort I/O early if the layout segment was invalidated"
This reverts commit a79f194aa4.
The mechanism for aborting I/O is racy, since we are not guaranteed that
the request is asleep while we're changing both task->tk_status and
task->tk_action.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v5.1
2019-08-26 15:31:29 -04:00
Trond Myklebust
bf2bf9b80e pNFS/flexfiles: Turn off soft RPC calls
The pNFS/flexfiles I/O requests are sent with the SOFTCONN flag set, so
they automatically time out if the connection breaks. It should
therefore not be necessary to have the soft flag set in addition.

Fixes: 5f01d95394 ("nfs41: create NFSv3 DS connection if specified")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-26 15:31:29 -04:00
Anna Schumaker
f836b27eca NFS: Have nfs4_proc_get_lease_time() call nfs4_call_sync_custom()
This removes some code duplication, since both functions were doing the
same thing.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:35 -04:00
Anna Schumaker
cc15e24a3a NFS: Have nfs41_proc_secinfo_no_name() call nfs4_call_sync_custom()
We need to use the custom rpc_task_setup here to set the
RPC_TASK_NO_ROUND_ROBIN flag on the RPC call.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:30 -04:00
Anna Schumaker
4c952e3d1b NFS: Have nfs41_proc_reclaim_complete() call nfs4_call_sync_custom()
An async call followed by an rpc_wait_for_completion() is basically the
same as a synchronous call, so we can use nfs4_call_sync_custom() to
keep our custom callback ops and the RPC_TASK_NO_ROUND_ROBIN flag.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:21 -04:00
Anna Schumaker
50493364e7 NFS: Have _nfs4_proc_secinfo() call nfs4_call_sync_custom()
We do this to set the RPC_TASK_NO_ROUND_ROBIN flag in the task_setup
structure

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:16 -04:00
Anna Schumaker
dae40965d5 NFS: Have nfs4_proc_setclientid() call nfs4_call_sync_custom()
Rather than running the task manually

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:11 -04:00
Anna Schumaker
48c058543c NFS: Add an nfs4_call_sync_custom() function
There are a few cases where we need to manually configure the
rpc_task_setup structure to get the behavior we want.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-22 10:05:06 -04:00
Wenwen Wang
1e672e3644 NFSv4: Fix a memory leak bug
In nfs4_try_migration(), if nfs4_begin_drain_session() fails, the
previously allocated 'page' and 'locations' are not deallocated, leading to
memory leaks. To fix this issue, go to the 'out' label to free 'page' and
'locations' before returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-21 16:39:29 -04:00
Jia-Ju Bai
e2751463ea fs: nfs: Fix possible null-pointer dereferences in encode_attrs()
In encode_attrs(), there is an if statement on line 1145 to check
whether label is NULL:
    if (label && (attrmask[2] & FATTR4_WORD2_SECURITY_LABEL))

When label is NULL, it is used on lines 1178-1181:
    *p++ = cpu_to_be32(label->lfs);
    *p++ = cpu_to_be32(label->pi);
    *p++ = cpu_to_be32(label->len);
    p = xdr_encode_opaque_fixed(p, label->label, label->len);

To fix these bugs, label is checked before being used.

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2019-08-20 09:30:50 -04:00
Trond Myklebust
06c9fdf3b9 NFS: On fatal writeback errors, we need to call nfs_inode_remove_request()
If the writeback error is fatal, we need to remove the tracking structures
(i.e. the nfs_page) from the inode.

Fixes: 6fbda89b25 ("NFS: Replace custom error reporting mechanism...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19 08:56:04 -04:00
Trond Myklebust
17d8c5d145 NFS: Fix initialisation of I/O result struct in nfs_pgio_rpcsetup
Initialise the result count to 0 rather than initialising it to the
argument count. The reason is that we want to ensure we record the
I/O stats correctly in the case where an error is returned (for
instance in the layoutstats).

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19 08:56:04 -04:00
Trond Myklebust
eb2c50da9e NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
If the attempt to resend the I/O results in no bytes being read/written,
we must ensure that we report the error.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: 0a00b77b33 ("nfs: mirroring support for direct io")
Cc: stable@vger.kernel.org # v3.20+
2019-08-19 08:56:04 -04:00
Trond Myklebust
f4340e9314 NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
If the attempt to resend the pages fails, we need to ensure that we
clean up those pages that were not transmitted.

Fixes: d600ad1f2b ("NFS41: pop some layoutget errors to application")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.5+
2019-08-19 08:56:04 -04:00
Trond Myklebust
9821421a29 NFSv4: Fix return value in nfs_finish_open()
If the file turns out to be of the wrong type after opening, we want
to revalidate the path and retry, so return EOPENSTALE rather than
ESTALE.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19 08:56:04 -04:00
Trond Myklebust
90cf500e33 NFSv4: Fix return values for nfs4_file_open()
Currently, we are translating RPC level errors such as timeouts,
as well as interrupts etc into EOPENSTALE, which forces a single
replay of the open attempt. What we actually want to do is
force the replay only in the cases where the returned error
indicates that the file may have changed on the server.

So the fix is to spell out the exact set of errors where we want
to return EOPENSTALE.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19 08:56:04 -04:00
Trond Myklebust
7e10cc25bf NFS: Don't refresh attributes with mounted-on-file information
If we've been given the attributes of the mounted-on-file, then do not
use those to check or update the attributes on the application-visible
inode.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-19 08:56:04 -04:00
Trond Myklebust
67e7b52d44 NFSv4: Ensure state recovery handles ETIMEDOUT correctly
Ensure that the state recovery code handles ETIMEDOUT correctly,
and also that we set RPC_TASK_TIMEOUT when recovering open state.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-07 12:55:11 -04:00
Trond Myklebust
dea1bb35c5 NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts
People are reporing seeing fscache errors being reported concerning
duplicate cookies even in cases where they are not setting up fscache
at all. The rule needs to be that if fscache is not enabled, then it
should have no side effects at all.

To ensure this is the case, we disable fscache completely on all superblocks
for which the 'fsc' mount option was not set. In order to avoid issues
with '-oremount', we also disable the ability to turn fscache on via
remount.

Fixes: f1fe29b4a0 ("NFS: Use i_writecount to control whether...")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200145
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Steve Dickson <steved@redhat.com>
Cc: David Howells <dhowells@redhat.com>
2019-08-04 22:35:41 -04:00
Trond Myklebust
09a54f0ebf NFSv4: Fix an Oops in nfs4_do_setattr
If the user specifies an open mode of 3, then we don't have a NFSv4 state
attached to the context, and so we Oops when we try to dereference it.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 29b59f9416 ("NFSv4: change nfs4_do_setattr to take...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.10: 991eedb137: NFSv4: Only pass the...
Cc: stable@vger.kernel.org # v4.10+
2019-08-04 22:35:41 -04:00
Trond Myklebust
c77e22834a NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
John Hubbard reports seeing the following stack trace:

nfs4_do_reclaim
   rcu_read_lock /* we are now in_atomic() and must not sleep */
       nfs4_purge_state_owners
           nfs4_free_state_owner
               nfs4_destroy_seqid_counter
                   rpc_destroy_wait_queue
                       cancel_delayed_work_sync
                           __cancel_work_timer
                               __flush_work
                                   start_flush_work
                                       might_sleep:
                                        (kernel/workqueue.c:2975: BUG)

The solution is to separate out the freeing of the state owners
from nfs4_purge_state_owners(), and perform that outside the atomic
context.

Reported-by: John Hubbard <jhubbard@nvidia.com>
Fixes: 0aaaf5c424 ("NFS: Cache state owners after files are closed")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
e3c8dc761e NFSv4: Check the return value of update_open_stateid()
Ensure that we always check the return value of update_open_stateid()
so that we can retry if the update of local state failed. This fixes
infinite looping on state recovery.

Fixes: e23008ec81 ("NFSv4 reduce attribute requests for open reclaim")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v3.7+
2019-08-04 22:35:40 -04:00
Trond Myklebust
ad11408970 NFSv4.1: Only reap expired delegations
Fix nfs_reap_expired_delegations() to ensure that we only reap delegations
that are actually expired, rather than triggering on random errors.

Fixes: 45870d6909 ("NFSv4.1: Test delegation stateids when server...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
27a30cf64a NFSv4.1: Fix open stateid recovery
The logic for checking in nfs41_check_open_stateid() whether the state
is supported by a delegation is inverted. In addition, it makes more
sense to perform that check before we check for expired locks.

Fixes: 8a64c4ef10 ("NFSv4.1: Even if the stateid is OK,...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
731c74dd98 NFSv4: Report the error from nfs4_select_rw_stateid()
In pnfs_update_layout() ensure that we do report any fatal errors from
nfs4_select_rw_stateid().

Fixes: d9aba2b40d ("NFSv4: Don't use the zero stateid with layoutget")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
c34fae003c NFSv4: When recovering state fails with EAGAIN, retry the same recovery
If the server returns with EAGAIN when we're trying to recover from
a server reboot, we currently delay for 1 second, but then mark the
stateid as needing recovery after the grace period has expired.

Instead, we should just retry the same recovery process immediately
after the 1 second delay. Break out of the loop after 10 retries.

Fixes: 35a61606a6 ("NFS: Reduce indentation of the switch statement...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
86dbd08b32 NFSv4: Print an error in the syslog when state is marked as irrecoverable
When error recovery fails due to a fatal error on the server, ensure
we log it in the syslog.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Trond Myklebust
5eb8d18ca0 NFSv4: Fix delegation state recovery
Once we clear the NFS_DELEGATED_STATE flag, we're telling
nfs_delegation_claim_opens() that we're done recovering all open state
for that stateid, so we really need to ensure that we test for all
open modes that are currently cached and recover them before exiting
nfs4_open_delegation_recall().

Fixes: 24311f8841 ("NFSv4: Recovery of recalled read delegations...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.3+
2019-08-04 22:35:40 -04:00
Trond Myklebust
8c39a39e28 NFSv4: Fix a credential refcount leak in nfs41_check_delegation_stateid
It is unsafe to dereference delegation outside the rcu lock, and in
any case, the refcount is guaranteed held if cred is non-zero.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-08-04 22:35:40 -04:00
Linus Torvalds
18253e034d Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull dcache and mountpoint updates from Al Viro:
 "Saner handling of refcounts to mountpoints.

  Transfer the counting reference from struct mount ->mnt_mountpoint
  over to struct mountpoint ->m_dentry. That allows us to get rid of the
  convoluted games with ordering of mount shutdowns.

  The cost is in teaching shrink_dcache_{parent,for_umount} to cope with
  mixed-filesystem shrink lists, which we'll also need for the Slab
  Movable Objects patchset"

* 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  switch the remnants of releasing the mountpoint away from fs_pin
  get rid of detach_mnt()
  make struct mountpoint bear the dentry reference to mountpoint, not struct mount
  Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists
  fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt()
  __detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore
  nfs: dget_parent() never returns NULL
  ceph: don't open-code the check for dead lockref
2019-07-20 09:15:51 -07:00
Linus Torvalds
6860c981b9 Merge tag 'nfs-for-5.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable fixes:

   - SUNRPC: Ensure bvecs are re-synced when we re-encode the RPC
     request

   - Fix an Oops in ff_layout_track_ds_error due to a PTR_ERR()
     dereference

   - Revert buggy NFS readdirplus optimisation

   - NFSv4: Handle the special Linux file open access mode

   - pnfs: Fix a problem where we gratuitously start doing I/O through
     the MDS

  Features:

   - Allow NFS client to set up multiple TCP connections to the server
     using a new 'nconnect=X' mount option. Queue length is used to
     balance load.

   - Enhance statistics reporting to report on all transports when using
     multiple connections.

   - Speed up SUNRPC by removing bh-safe spinlocks

   - Add a mechanism to allow NFSv4 to request that containers set a
     unique per-host identifier for when the hostname is not set.

   - Ensure NFSv4 updates the lease_time after a clientid update

  Bugfixes and cleanup:

   - Fix use-after-free in rpcrdma_post_recvs

   - Fix a memory leak when nfs_match_client() is interrupted

   - Fix buggy file access checking in NFSv4 open for execute

   - disable unsupported client side deduplication

   - Fix spurious client disconnections

   - Fix occasional RDMA transport deadlock

   - Various RDMA cleanups

   - Various tracepoint fixes

   - Fix the TCP callback channel to guarantee the server can actually
     send the number of callback requests that was negotiated at mount
     time"

* tag 'nfs-for-5.3-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (68 commits)
  pnfs/flexfiles: Add tracepoints for detecting pnfs fallback to MDS
  pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
  SUNRPC: Optimise transport balancing code
  SUNRPC: Ensure the bvecs are reset when we re-encode the RPC request
  pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error
  NFSv4: Don't use the zero stateid with layoutget
  SUNRPC: Fix up backchannel slot table accounting
  SUNRPC: Fix initialisation of struct rpc_xprt_switch
  SUNRPC: Skip zero-refcount transports
  SUNRPC: Replace division by multiplication in calculation of queue length
  NFSv4: Validate the stateid before applying it to state recovery
  nfs4.0: Refetch lease_time after clientid update
  nfs4: Rename nfs41_setup_state_renewal
  nfs4: Make nfs4_proc_get_lease_time available for nfs4.0
  nfs: Fix copy-and-paste error in debug message
  NFS: Replace 16 seq_printf() calls by seq_puts()
  NFS: Use seq_putc() in nfs_show_stats()
  Revert "NFS: readdirplus optimization by cache mechanism" (memleak)
  SUNRPC: Fix transport accounting when caller specifies an rpc_xprt
  NFS: Record task, client ID, and XID in xdr_status trace points
  ...
2019-07-18 14:32:33 -07:00
Trond Myklebust
d5b9216fd5 pnfs/flexfiles: Add tracepoints for detecting pnfs fallback to MDS
Add tracepoints to allow debugging of the event chain leading to
a pnfs fallback to doing I/O through the MDS.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-18 15:50:28 -04:00
Trond Myklebust
58bbeab425 pnfs: Fix a problem where we gratuitously start doing I/O through the MDS
If the client has to stop in pnfs_update_layout() to wait for another
layoutget to complete, it currently exits and defaults to I/O through
the MDS if the layoutget was successful.

Fixes: d03360aaf5 ("pNFS: Ensure we return the error if someone kills...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+
2019-07-18 15:33:42 -04:00
Trond Myklebust
8e04fdfadd pnfs/flexfiles: Fix PTR_ERR() dereferences in ff_layout_track_ds_error
mirror->mirror_ds can be NULL if uninitialised, but can contain
a PTR_ERR() if call to GETDEVICEINFO failed.

Fixes: 65990d1afb ("pNFS/flexfiles: Fix a deadlock on LAYOUTGET")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # 4.10+
2019-07-18 14:43:52 -04:00
Trond Myklebust
d9aba2b40d NFSv4: Don't use the zero stateid with layoutget
The NFSv4.1 protocol explicitly forbids us from using the zero stateid
together with layoutget, so when we see that nfs4_select_rw_stateid()
is unable to return a valid delegation, lock or open stateid, then
we should initiate recovery and retry.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-18 14:43:52 -04:00
Trond Myklebust
7402a4fedc SUNRPC: Fix up backchannel slot table accounting
Add a per-transport maximum limit in the socket case, and add
helpers to allow the NFSv4 code to discover that limit.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-18 01:12:59 -04:00
Trond Myklebust
50c8000744 NFSv4: Validate the stateid before applying it to state recovery
If the stateid is the zero or invalid stateid, then it is pointless
to attempt to use it for recovery. In that case, try to fall back
to using the open state stateid, or just doing a general recovery
of all state on a given inode.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-15 10:11:20 -04:00
Donald Buczek
5b596830d9 nfs4.0: Refetch lease_time after clientid update
RFC 7530 requires us to refetch the lease time attribute once a new
clientID is established. This is already implemented for the
nfs4.1(+) clients by nfs41_init_clientid, which calls
nfs41_finish_session_reset, which calls nfs4_setup_state_renewal.

To make nfs4_setup_state_renewal available for nfs4.0, move it
further to the top of the source file to include it regardles of
CONFIG_NFS_V4_1 and to save a forward declaration.

Call nfs4_setup_state_renewal from nfs4_init_clientid.

Signed-off-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-13 11:48:41 -04:00
Donald Buczek
ea51efaa96 nfs4: Rename nfs41_setup_state_renewal
The function nfs41_setup_state_renewal is useful to the nfs 4.0 client
as well, so rename the function to nfs4_setup_state_renewal.

Signed-off-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2019-07-13 11:48:41 -04:00