Commit Graph

3400 Commits

Author SHA1 Message Date
Chuck Lever
0cdd80d07f NFS: remove support for multi-segment iovs in the direct read path
Eliminate the persistent use of automatic storage in all parts of the NFS
client's direct read path to pave the way for introducing support for aio
against files opened with the O_DIRECT flag.

Test plan:
Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled.
Millions of fsx-odirect ops.  OraSim.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:29 -05:00
Chuck Lever
5dd602f206 NFS: use size_t type for holding rsize bytes in NFS O_DIRECT read path
size_t is used for holding byte counts, so use it for variables storing rsize.
Note that the write path will be updated as we add support for async
O_DIRECT writes.

Test plan:
Need to verify that existing comparisons against new size_t variables behave
correctly.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:29 -05:00
Chuck Lever
d4cc948ba9 NFS: update comments and function definitions in fs/nfs/direct.c
Update to latest coding style standards.  Remove block comments on
statically defined functions, and place function definitions all on
one line.

Test plan:
Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:28 -05:00
Chuck Lever
b8a32e2b8b NFS: clean up NFS client's a_ops->direct_IO method
The NFS client's a_ops->direct_IO method, nfs_direct_IO, is required to
be present to allow NFS files to be opened with O_DIRECT, but is never
called because the NFS client shunts reads and writes to files opened
with O_DIRECT directly to its own routines.

Gut the nfs_direct_IO function.  This eliminates the only part of the
NFS client's direct I/O path that requires support for multi-segment
iovs, allowing further simplification in subsequent patches.

Test plan:
Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled.  Millions
of fsx-odirect ops.  OraSim.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:28 -05:00
Trond Myklebust
ec06c096ed NFS: Cleanup of NFS read code
Same callback hierarchy inversion as for the NFS write calls. This patch is
not strictly speaking needed by the O_DIRECT code, but avoids confusing
differences between the asynchronous read and write code.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
Trond Myklebust
788e7a89a0 NFS: Cleanup of NFS write code in preparation for asynchronous o_direct
This patch inverts the callback hierarchy for NFS write calls.

Instead of having the NFSv2/v3/v4-specific code set up the RPC callback
ops, we allow the original caller to do so. This allows for more
flexibility w.r.t. how to set up and tear down the nfs_write_data
structure while still allowing the NFSv3/v4 code to perform error
handling.

The greater flexibility is needed by the asynchronous O_DIRECT code, which
wants to be able to hold on to the original nfs_write_data structures after
the WRITE RPC call has completed in order to be able to replay them if the
COMMIT call determines that the server has rebooted.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:27 -05:00
J. Bruce Fields
7117bf3dfb lockd: Remove FL_LOCKD flag
Currently lockd identifies its own locks using the FL_LOCKD flag.  This
doesn't scale well to multiple lock managers--if we did this in nfsv4 too,
for example, we'd be left with only one free flag bit.

Instead, we just check whether the file manager ops (fl_lmops) set on this
lock are our own.

The only use for this is in nlm_traverse_locks, which uses it to find locks
that need cleaning up when freeing a host or a file.

In the long run it might be nice to do reference counting instead of
traversing all the locks like this....

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:26 -05:00
Andy Adamson
8dc7c3115b locks,lockd: fix race in nlmsvc_testlock
posix_test_lock() returns a pointer to a struct file_lock which is unprotected
and can be removed while in use by the caller.  Move the conflicting lock from
the return to a parameter, and copy the conflicting lock.

In most cases the caller ends up putting the copy of the conflicting lock on
the stack.  On i386, sizeof(struct file_lock) appears to be about 100 bytes.
We're assuming that's reasonable.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:26 -05:00
Andy Adamson
2e0af86f61 locks: remove unused posix_block_lock
posix_lock_file() is used to add a blocked lock to Lockd's block, so
posix_block_lock() is no longer needed.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:26 -05:00
Andy Adamson
a85f193e2f lockd: make nlmsvc_lock use only posix_lock_file
Reorganize nlmsvc_lock() to make full use of posix_lock_file(), which does
eveything nlmsvc_lock() needs - no need to call posix_test_lock(),
posix_locks_deadlock(), or posix_block_lock() separately.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:25 -05:00
Andy Adamson
5de0e5024a lockd: simplify nlmsvc_grant_blocked
Reorganize nlmsvc_grant_blocked() to make full use of posix_lock_file().  Note
that there's no need for separate calls to posix_test_lock(),
posix_locks_deadlock(), or posix_block_lock().

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:25 -05:00
Andy Adamson
15dadef946 lockd: clean up nlmsvc_lock
Slightly more consistent dprintk error reporting, consolidate some up()'s.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:24 -05:00
Chuck Lever
1e7cb3dc12 NFS: directory trace messages
Reuse NFSDBG_DIRCACHE and NFSDBG_LOOKUPCACHE to provide additional
diagnostic messages that trace the operation of the NFS client's
directory behavior.  A few new messages are now generated when NFSDBG_VFS
is active, as well, to trace normal VFS activity.  This compromise
provides better trace debugging for those who use pre-built kernels,
without adding a lot of extra noise to the standard debug settings.

Test-plan:
Enable NFS trace debugging with flags 1, 2, or 4.  You should be able to
see different types of trace messages with each flag setting.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:24 -05:00
Chuck Lever
dead28da8e SUNRPC: eliminate rpc_call()
Clean-up: replace rpc_call() helper with direct call to rpc_call_sync.

This makes NFSv2 and NFSv3 synchronous calls more computationally
efficient, and reduces stack consumption in functions that used to
invoke rpc_call more than once.

Test plan:
Compile kernel with CONFIG_NFS enabled.  Connectathon on NFS version 2,
version 3, and version 4 mount points.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:23 -05:00
Chuck Lever
cc0175c1dc SUNRPC: display human-readable procedure name in rpc_iostats output
Add fields to the rpc_procinfo struct that allow the display of a
human-readable name for each procedure in the rpc_iostats output.

Also fix it so that the NFSv4 stats are broken up correctly by
sub-procedure number.  NFSv4 uses only two real RPC procedures:
NULL, and COMPOUND.

Test plan:
Mount with NFSv2, NFSv3, and NFSv4, and do "cat /proc/self/mountstats".

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:22 -05:00
Chuck Lever
4ece3a2d18 NFS: add RPC I/O statistics to /proc/self/mountstats
NFS client now shows various RPC I/O metrics in /proc/self/mountstats.

Test plan:
Mount/umount while doing "cat /proc/self/mountstats", multiple iterations
of connectathon locking suite.  Test with NFS version 2, 3, and 4.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:22 -05:00
Chuck Lever
67ec9f46b8 NFS: report how long an NFS file system has been mounted
Add a field in nfs_server to record a timestamp when a mount succeeds.
Report the number of seconds the file system has been mounted via
nfs_show_stats().

Test plan:
Mount an NFS file system, watch the mountstats reports and compare with
clock time.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:15 -05:00
Chuck Lever
006ea73e5f NFS: add hooks to account for NFSERR_JUKEBOX errors
Make an inode or an nfs_server struct available in the logic that handles
JUKEBOX/DELAY type errors so the NFS client can account for them.

This patch is split out from the main nfs iostat patch to highlight minor
architectural changes required to support this statistic.

Test plan:
None.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:14 -05:00
Chuck Lever
91d5b47023 NFS: add I/O performance counters
Invoke the byte and event counter macros where we want to count bytes and
events.

Clean-up: fix a possible NULL dereference in nfs_lock, and simplify
nfs_file_open.

Test-plan:
fsx and iozone on UP and SMP systems, with and without pre-emption.  Watch
for memory overwrite bugs, and performance loss (significantly more CPU
required per op).

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:14 -05:00
Chuck Lever
d9ef5a8c26 NFS: introduce mechanism for tracking NFS client metrics
Add a per-superblock performance counter facility to the NFS client.  This
facility mimics the counters available for block devices and for
networking.  Expose these new counters via the new /proc/self/mountstats
interface.

Thanks to Andrew Morton and Trond Myklebust for their review and comments.

Test plan:
fsx and iozone on UP and SMP systems, with and without pre-emption.  Watch
for memory overwrite bugs, and performance loss (significantly more CPU
required per op).

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:13 -05:00
Chuck Lever
c8bded96aa NFS: clean up some mount options
Get rid of "lock" and "posix", and spell out "vers=".

Test plan:
None.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:13 -05:00
Chuck Lever
7a480e250c NFS: show retransmit settings when displaying mount options
Sometimes it's important to know the exact RPC retransmit settings the
kernel is using for an NFS mount point.  Add this facility to the NFS
client's show_options method.

Test plan:
Set various retransmit settings via the mount command, and check that the
settings are reflected in /proc/mounts.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:12 -05:00
Chuck Lever
b4629fe2f0 VFS: New /proc file /proc/self/mountstats
Create a new file under /proc/self, called mountstats, where mounted file
systems can export information (configuration options, performance counters,
and so on).  Use a mechanism similar to /proc/mounts and s_ops->show_options.

This mechanism does not violate namespace security, and is safe to use while
other processes are unmounting file systems.

Thanks to Mike Waychison for his review and comments.

Test-plan:
Test concurrent mount/unmount operations while cat'ing /proc/self/mountstats.

Signed-off-by: Chuck Lever <cel@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:12 -05:00
Ingo Molnar
c9d5128a10 NFS: sem2mutex idmap.c
semaphore to mutex conversion.

the conversion was generated via scripts, and the result was validated
automatically via a script as well.

build and boot tested.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:11 -05:00
Eric Sesterhenn
bd6475454c NFS: kzalloc conversion in fs/nfs
this converts fs/nfs to kzalloc() usage.
compile tested with make allyesconfig

Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:10 -05:00
Trond Myklebust
a162a6b804 NFSv4: Kill braindead gcc warnings
nfs4_open_revalidate: 'res' may be used uninitialized
nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized
			'op_nr’ may be used uninitialized
encode_getattr_res: ‘savep’ may be used uninitialized

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:10 -05:00
Trond Myklebust
967b928136 NFSv4: Do not call rpciod_down() before call to destroy_nfsv4_state()
The reason is that the idmapper cleanup may call flush_workqueue() on
rpciod_workqueue.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:09 -05:00
Trond Myklebust
12de3b35ea SUNRPC: Ensure that rpc_mkpipe returns a refcounted dentry
If not, we cannot guarantee that idmap->idmap_dentry, gss_auth->dentry and
clnt->cl_dentry are valid dentries.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:09 -05:00
Trond Myklebust
fb374d24f2 NFS: reduce the number of false cache invalidations.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:08 -05:00
Jesper Juhl
c8d149f3db NFS: "const static" vs "static const" in nfs4
My previous "const static" vs "static const" cleanup missed a single case,
patch below takes care of it.

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:07 -05:00
Trond Myklebust
ca62b9c3f7 NFSv4: Don't invalidate cached attributes if change attribute is unchanged
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:07 -05:00
Trond Myklebust
755c1e20cd NFS: writes should not clobber utimes() calls
Ensure that we flush out writes in the case when someone calls utimes() in
order to set the file times.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:06 -05:00
Trond Myklebust
7bab377fcb lockd: Don't expose the process pid to the NLM server
Instead we use the nlm_lockowner->pid.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:06 -05:00
Trond Myklebust
36943fa4b2 NLM: nlm_alloc_call should not immediately fail on signal
Currently, nlm_alloc_call tests for a signal before it even tries to
allocate memory.
Fix it so that it tries at least once.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:05 -05:00
Trond Myklebust
47831f35b8 VFS: Fix __posix_lock_file() copy of private lock area
The struct file_lock->fl_u area must be copied using the fl_copy_lock()
operation.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:05 -05:00
Neil Brown
1dd594b21b NFS: Fix buglet in fs/nfs/write.c
I've been reading through fs/nfs/write.c trying to track down a bug
that seems to be related to pages loosing a refcount and getting
freed too early (you interested in detail??) and I spotted a little
bug which the following patch should fix.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:04 -05:00
Trond Myklebust
cd52ed3553 NFS: Avoid races between writebacks and truncation
Currently, there is no serialisation between NFS asynchronous writebacks
and truncation at the page level due to the fact that nfs_sync_inode()
cannot lock the pages that it is about to write out.

This means that it is possible to be flushing out data (and calling something
like set_page_writeback()) while the page cache is busy evicting the page.
Oops...

Use the hooks provided in try_to_release_page() to ensure that dirty pages
are always written back to storage before we evict them.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:04 -05:00
Trond Myklebust
b92dccf65b NFS: Fix a busy inodes issue...
The nfs_open_context may live longer than the file descriptor that spawned
it, so it needs to carry a reference to the vfsmount. If not, then
generic_shutdown_super() may end up being called before reads and writes
have been flushed out.

Make a couple of functions static while we're at it...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-03-20 13:44:03 -05:00
Linus Torvalds
88dcb91177 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shaggy/jfs-2.6:
  JFS: add uid, gid, and umask mount options
  JFS: Take logsync lock before testing mp->lsn
  JFS: kzalloc conversion
  JFS: Add missing file from fa3241d24c
  JFS: Use the kthread_ API
  JFS: Fix regression.  fsck complains if symlinks do not have INLINEEA attribute
  JFS: ext2 inode attributes for jfs
  JFS: semaphore to mutex conversion.
  JFS: make buddy table static
  JFS: Add back directory i_size calculations for legacy partitions
2006-03-20 10:32:33 -08:00
Steve French
fd4a0b92db Merge with /pub/scm/linux/kernel/git/torvalds/linux-2.6.git
Signed-off-by: Steve French <sfrench@us.ibm.com>
2006-03-20 16:58:09 +00:00
Nathan Scott
6cc8fef4cb [XFS] Fix compiler warning from xfs_file_compat_invis_ioctl prototype.
SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25509a

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-20 13:25:48 +11:00
Peter Staubach
85c6932ef0 [PATCH] nfsservctl(): remove user-triggerable printk
A user can use nfsservctl() to spam the logs.

This can happen because the arguments to the nfsservctl() system call are
versioned.  This is a good thing.  However, when a bad version is detected,
the kernel prints a message and then returns an error.

Signed-off-by: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-17 07:51:25 -08:00
Eric Van Hensbergen
8532159f55 [PATCH] v9fs: fix overzealous dropping of dentry which breaks dcache
There is a d_drop in dir_release which caused problems as it invalidates
dcache entries too soon.  This was likely a part of the wierd cwd behavior
folks were seeing.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-17 07:51:25 -08:00
Nathan Scott
b2fc6ad01b [XFS] remove bogus INT_GET for u8 variables in xfs_dir_leaf.c
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25506a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:30:01 +11:00
Nathan Scott
fac80cce0e [XFS] endianess annotations for xfs_da_node_hdr_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25505a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:56 +11:00
Nathan Scott
403432dcb5 [XFS] endianess annotations for xfs_da_node_entry_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25504a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:46 +11:00
Nathan Scott
d7929ff670 [XFS] store xfs_attr_inactive_list_t in native endian
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25503a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:36 +11:00
Nathan Scott
984a081a7c [XFS] store xfs_attr_sf_sort in native endian
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25502a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:31 +11:00
Nathan Scott
3b244aa81e [XFS] endianess annotations for xfs_attr_shortform_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25501a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:25 +11:00
Nathan Scott
c0f054e7a4 [XFS] endianess annotations for xfs_attr_leaf_name_remote_t
SGI-PV: 943272
SGI-Modid: xfs-linux-melb:xfs-kern:25500a

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nathan Scott <nathans@sgi.com>
2006-03-17 17:29:18 +11:00