Commit Graph

15761 Commits

Author SHA1 Message Date
Trond Myklebust
d716f0b8a5 SUNRPC: nfsacl_encode/nfsacl_decode should be exported as GPL-only
Again, this has never been intended as a public ABI for out-of-tree
modules.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:32 -05:00
Wu Fengguang
136221fc32 nfs: remove redundant tests on reading new pages
aops->readpages() and its NFS helper readpage_async_filler() will only
be called to do readahead I/O for newly allocated pages. So it's not
necessary to test for the always 0 dirty/uptodate page flags.

The removal of nfs_wb_page() call also fixes a readahead bug: the NFS
readahead has been synchronous since 2.6.23, because that call will
clear PG_readahead, which is the reminder for asynchronous readahead.

More background: the PG_readahead page flag is shared with PG_reclaim,
one for read path and the other for write path. clear_page_dirty_for_io()
unconditionally clears PG_readahead to prevent possible readahead residuals,
assuming itself to be always called in the write path. However, NFS is the
one and only exception in that it _always_ calls clear_page_dirty_for_io()
in the read path, i.e. for readpages()/readpage().

Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-12-23 15:21:30 -05:00
Andrew Morton
722d74219e dlm: fs/dlm/ast.c: fix warning
fs/dlm/ast.c: In function 'dlm_astd':
fs/dlm/ast.c:64: warning: 'bastmode' may be used uninitialized in this function

Cleans code up.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:22:56 -06:00
David Teigland
d022509d1c dlm: add new debugfs entry
The new debugfs entry dumps all rsb and lkb structures, and includes
a lot more information than has been available before.  This includes
the new timestamps added by a previous patch for debugging callback
issues.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:18:51 -06:00
David Teigland
e3a84ad495 dlm: add time stamp of blocking callback
Record the time the latest blocking callback was queued for
a lock.  This will be used for debugging in combination with
lock queue timestamp changes in the previous patch.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:18:34 -06:00
David Teigland
eeda418d8c dlm: change lock time stamping
Use ktime instead of jiffies for timestamping lkb's.  Also stamp the
time on every lkb whenever it's added to a resource queue, instead of
just stamping locks subject to timeouts.  This will allow us to use
timestamps more widely for debugging all locks.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:18:17 -06:00
David Teigland
fd22a51bcc dlm: improve bast mode handling
The lkb bastmode value is set in the context of processing the
lock, and read by the dlm_astd thread.  Because it's accessed
in these two separate contexts, the writing/reading ought to
be done under a lock.  This is simple to do by setting it and
reading it when the lkb is added to and removed from dlm_astd's
callback list which is properly locked.

Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:16:46 -06:00
David Teigland
0333969631 dlm: remove extra blocking callback check
Just before delivering a blocking callback (bast), the dlm_astd
thread checks again that the granted mode of the lkb actually
blocks the mode requested by the bast.  The idea behind this was
originally that the granted mode may have changed since the bast
was queued, making the callback now unnecessary.  Reasons for
removing this extra check are:
- dlm_astd doesn't lock the rsb before reading the lkb grmode, so
  it's not technically safe (this removes the long standing FIXME)
- after running some tests, it doesn't appear the check ever actually
  eliminates a bast
- delivering an unnecessary blocking callback isn't a bad thing and
  can happen anyway

Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:16:32 -06:00
Steven Whitehouse
d61e9aac96 dlm: replace schedule with cond_resched
This is a one-liner to use cond_resched() rather than schedule()
in the ast delivery loop. It should not be necessary to schedule
every time, so this will save some cpu time while continuing to
allow scheduling when required.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:16:13 -06:00
Steven Whitehouse
1521848cbb dlm: remove kmap/kunmap
The pages used in lowcomms are not highmem, so kmap is not necessary.

Cc: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:16:01 -06:00
Harvey Harrison
cd8e4679bd dlm: trivial annotation of be16 value
fs/dlm/dir.c:419:14: warning: incorrect type in assignment (different base types)
fs/dlm/dir.c:419:14:    expected unsigned short [unsigned] [addressable] [assigned] [usertype] be_namelen
fs/dlm/dir.c:419:14:    got restricted __be16 [usertype] <noident>

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:15:51 -06:00
Steven Whitehouse
d6d7b702a3 dlm: fix up memory allocation flags
Use ls_allocation for memory allocations, which a cluster fs sets to
GFP_NOFS.  Use GFP_NOFS for allocations when no lockspace struct is
available.  Taking dlm locks needs to avoid calling back into the
cluster fs because write-out can require taking dlm locks.

Cc: Christine Caulfield <ccaulfie@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
2008-12-23 10:15:40 -06:00
Artem Bityutskiy
c8f915913a UBIFS: avoid unnecessary calculations
Do not calculate min_idx_lebs, because it is available in
c->min_idx_lebs

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:24:16 +02:00
Artem Bityutskiy
650ed50f42 UBIFS: re-calculate min_idx_size after the commit
When we commit, but before we try to write anything to the flash
media, @c->min_idx_size is inaccurate, because we do not re-calculate
it after the commit. Do not forget to do this.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:24:05 +02:00
Artem Bityutskiy
4d61db4f87 UBIFS: use nicer 64-bit math
Instead of using do_div(), use better primitives from
linux/math64.h.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:23:40 +02:00
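As a rough illustration of why the linux/math64.h helpers read better than do_div(): in the kernel, do_div() is a macro that overwrites its 64-bit dividend with the quotient and returns the remainder, while div_u64() simply returns the quotient. The sketch below models both in userspace. `sketch_do_div`, `sketch_div_u64`, and the two `blocks_*` functions are hypothetical stand-ins written for this note, not the UBIFS code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's do_div():
 * divides *n in place and returns the remainder. */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
    assert(base != 0);
    uint32_t rem = (uint32_t)(*n % base);
    *n /= base;
    return rem;
}

/* Hypothetical stand-in for div_u64(): returns the quotient and
 * leaves its arguments untouched. */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
    assert(divisor != 0);
    return dividend / divisor;
}

/* Old style: a temporary is needed because the dividend is overwritten. */
uint64_t blocks_old_style(uint64_t bytes, uint32_t block_size)
{
    uint64_t tmp = bytes;
    (void)sketch_do_div(&tmp, block_size);
    return tmp;
}

/* New style: the division is a plain expression. */
uint64_t blocks_new_style(uint64_t bytes, uint32_t block_size)
{
    return sketch_div_u64(bytes, block_size);
}
```

Both functions return the same quotient; the second form avoids the in-place side effect that makes do_div() easy to misuse.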
Artem Bityutskiy
af14a1ad79 UBIFS: fix available blocks count
Take into account that 2 eraseblocks are never available because
they are reserved for the index. This gives a more realistic count
of FS blocks.

To avoid future confusion like this, introduce a constant.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:23:29 +02:00
Artem Bityutskiy
d3cf502b6c UBIFS: various comment improvements and fixes
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:23:08 +02:00
Artem Bityutskiy
21a6025897 UBIFS: improve budgeting dump
Dump available space calculated by budgeting subsystem.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:22:58 +02:00
Artem Bityutskiy
24fa9e9438 UBIFS: fix tnc dumping
debugfs tnc dumping was broken because of an obvious typo.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:22:39 +02:00
Artem Bityutskiy
7bbe5b5aa6 UBIFS: use PAGE_CACHE_MASK correctly
It has high bits set, not low bits set as the UBIFS code
assumed.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-12-23 12:19:14 +02:00
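To make the mask direction concrete, here is a minimal userspace sketch, assuming a 4096-byte page. `SKETCH_PAGE_SIZE` and `SKETCH_PAGE_MASK` are illustrative names that mirror how the kernel defines its page mask as `~(PAGE_SIZE - 1)`. The helpers are not taken from UBIFS.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the sketch; the real value is per-architecture. */
#define SKETCH_PAGE_SIZE 4096ULL
/* High bits set, low bits clear, like PAGE_CACHE_MASK. */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

static_assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0,
              "page size must be a power of two");

/* Round a byte position down to the start of its page. */
uint64_t page_start(uint64_t pos)
{
    return pos & SKETCH_PAGE_MASK;
}

/* Offset within the page: this needs the inverted mask. */
uint64_t offset_in_page(uint64_t pos)
{
    return pos & ~SKETCH_PAGE_MASK;
}
```

Code that assumes the mask has its low bits set would compute `pos & MASK` for the in-page offset and get the page start instead.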
Christoph Hellwig
ad1ad968f4 [XFS] handle unaligned data in xfs_bmbt_disk_get_all
In libxfs xfs_bmbt_disk_get_all needs to handle unaligned data and thus
has been updated to use get_unaligned_be64.  In kernelspace we don't strictly
need it as the routine is only used for tracing and xfsidbg, but let's keep
the two implementations in sync.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-23 11:54:46 +11:00
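For readers unfamiliar with unaligned accessors, the following is a portable sketch of what a helper like get_unaligned_be64 has to do: assemble a big-endian 64-bit value one byte at a time so that no aligned 64-bit load is required. `sketch_get_unaligned_be64` is a hypothetical userspace equivalent, not the kernel or libxfs implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian 64-bit value from a pointer with no alignment
 * guarantee. Byte-wise access is always safe, whatever the address. */
uint64_t sketch_get_unaligned_be64(const void *p)
{
    const unsigned char *b = p;
    uint64_t v = 0;

    assert(p);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];   /* most significant byte comes first */
    return v;
}
```

Casting the pointer to `uint64_t *` and dereferencing it would be undefined behavior at an odd address and can fault on strict-alignment architectures.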
Christoph Hellwig
efc557570d [XFS] avoid memory allocations in xfs_fs_vcmn_err
xfs_fs_vcmn_err can be called under a spinlock, but does a sleeping memory
allocation to create a buffer for its internal sprintf.  Fortunately it's
the only caller of icmn_err, so we can merge the two and have one single
static buffer and spinlock protecting it.  While we're at it make sure
we have proper __attribute__ format annotations so that the compiler can detect
mismatched format strings.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 18:02:01 +11:00
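The two ideas in this message, a single static buffer instead of a per-call allocation and a printf format annotation, can be sketched in userspace as below. `sketch_log` and `msg_buf` are hypothetical names. The sketch is single-threaded, so the spinlock that the commit describes around the buffer is only indicated by a comment.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* One static buffer: no allocation is needed at call time. */
static char msg_buf[256];

/* The format attribute lets GCC and Clang check the arguments
 * against the format string at compile time. */
const char *sketch_log(const char *fmt, ...)
    __attribute__((format(printf, 1, 2)));

const char *sketch_log(const char *fmt, ...)
{
    va_list ap;

    assert(fmt);
    /* A multi-threaded version would take a lock here. */
    va_start(ap, fmt);
    vsnprintf(msg_buf, sizeof(msg_buf), fmt, ap);
    va_end(ap);
    /* ...and release it once the message has been consumed. */
    return msg_buf;
}
```

With the annotation in place, a call such as `sketch_log("%s", 42)` draws a compiler warning under `-Wformat` instead of failing at run time.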
Lachlan McIlroy
9f6c92b9cc [XFS] Fix speculative allocation beyond eof
Speculative allocation beyond eof doesn't work properly.  It was
broken some time ago after a code cleanup that moved what is now
xfs_iomap_eof_align_last_fsb() and xfs_iomap_eof_want_preallocate()
out of xfs_iomap_write_delay() into separate functions.  The code
used to use the current file size in various checks but got changed
to be max(file_size, i_new_size).  Since i_new_size is the result
of 'offset + count' then in xfs_iomap_eof_want_preallocate() the
check for '(offset + count) <= isize' will always be true.

i.e., if 'offset + count' is > ip->i_size then isize will be i_new_size
and equal to 'offset + count'.

This change fixes all the places that used to use the current file
size.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:56:49 +11:00
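The always-true comparison described above can be reproduced with a few lines of arithmetic. This is a simplified model of that single check, not the XFS function. `write_extends_file_buggy` and `write_extends_file_fixed` are hypothetical names, and `file_size` stands for the current on-disk file size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t max_u64(uint64_t a, uint64_t b)
{
    return a > b ? a : b;
}

/* Buggy model: isize already includes the pending write, so the end
 * of the write can never lie beyond it. */
bool write_extends_file_buggy(uint64_t file_size, uint64_t offset,
                              uint64_t count)
{
    assert(offset <= UINT64_MAX - count);   /* no overflow in the sum */
    uint64_t new_size = offset + count;
    uint64_t isize = max_u64(file_size, new_size);

    return !(new_size <= isize);            /* always false */
}

/* Fixed model: compare against the current file size only. */
bool write_extends_file_fixed(uint64_t file_size, uint64_t offset,
                              uint64_t count)
{
    assert(offset <= UINT64_MAX - count);
    return offset + count > file_size;
}
```

For a 4096-byte file and a 4096-byte write at offset 4096, the buggy model reports that the write does not extend the file, which is the condition that suppressed preallocation.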
Lachlan McIlroy
4fdc778179 [XFS] Remove XFS_BUF_SHUT() and friends
Code does nothing so remove it.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:52:58 +11:00
Lachlan McIlroy
d415867e0a [XFS] Use the incore inode size in xfs_file_readdir()
We should be using the incore inode size here not the linux inode
size.  The incore inode size is always up to date for directories
whereas the linux inode size is not updated for directories.

We've hit assertions in xfs_bmap() and traced it back to the linux
inode size being zero but the incore size being correct.

Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:50:56 +11:00
Ingo Molnar
826e08b015 sched: fix warning in fs/proc/base.c
Stephen Rothwell reported this new (harmless) build warning on platforms that
define u64 to long:

 fs/proc/base.c: In function 'proc_pid_schedstat':
 fs/proc/base.c:352: warning: format '%llu' expects type 'long long unsigned int', but argument 3 has type 'u64'

asm-generic/int-l64.h platforms strike again: that file should be eliminated.

Fix it by casting the parameters to long long.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-22 07:41:06 +01:00
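The same warning can be reproduced in userspace, where `uint64_t` is `unsigned long` on typical 64-bit Linux targets. A minimal sketch of the cast-based fix follows. `format_runtime` is a hypothetical helper, not the code in fs/proc/base.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit counter with %llu. The cast makes the argument type
 * match the format on every platform, whether the 64-bit type is
 * defined as long or as long long. */
int format_runtime(char *buf, size_t len, uint64_t runtime_ns)
{
    assert(buf && len > 0);
    return snprintf(buf, len, "%llu", (unsigned long long)runtime_ns);
}
```

The cast does not change the value, since both types are 64 bits wide on the platforms in question.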
Lachlan McIlroy
27a0464a6c [XFS] Fix merge conflict in fs/xfs/xfs_rename.c
Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

Conflicts:

	fs/xfs/xfs_rename.c

Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-22 17:34:26 +11:00
Julia Lawall
f1d9e4586e fs/9p: change simple_strtol to simple_strtoul
Since v9ses->uid is unsigned, it would seem better to use simple_strtoul than
simple_strtol.

A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r2@
long e;
position p;
@@

e = simple_strtol@p(...)

@@
position p != r2.p;
type T;
T e;
@@

e =
- simple_strtol@p
+ simple_strtoul
  (...)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-12-19 16:50:22 -06:00
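A userspace analogue of the same idea, using the C library's `strtoul` in place of the kernel's simple_strtoul, might look like the sketch below. `parse_uid` is a hypothetical helper. Rejecting any input that does not start with a digit is a choice made for this sketch, because `strtoul` itself accepts leading whitespace and a sign.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal uid into an unsigned 32-bit value.
 * Returns 0 on success and -1 on malformed or out-of-range input. */
int parse_uid(const char *s, uint32_t *uid)
{
    char *end;
    unsigned long v;

    assert(s && uid);
    if (s[0] < '0' || s[0] > '9')    /* no sign, no leading whitespace */
        return -1;

    errno = 0;
    v = strtoul(s, &end, 10);        /* unsigned conversion for an unsigned field */
    if (errno != 0 || *end != '\0' || v > UINT32_MAX)
        return -1;

    *uid = (uint32_t)v;
    return 0;
}
```

Using the unsigned conversion keeps values above `INT32_MAX` representable even where `long` is 32 bits wide.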
Wu Fengguang
7dd0cdc51c 9p: convert d_iname references to d_name.name
d_iname is rubbish for long file names.
Use d_name.name in printks instead.

Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-12-19 16:47:40 -06:00
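The reason d_iname is unreliable for long names is that it is a small inline array used only when the name fits, while d_name.name always points at the full name. The toy model below illustrates that layout. `struct sketch_dentry`, its fields, and the 8-byte inline length are assumptions for this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_INLINE_LEN 8   /* assumed size; the kernel's differs */

struct sketch_dentry {
    const char *name;                /* always valid, like d_name.name */
    char iname[SKETCH_INLINE_LEN];   /* only holds short names, like d_iname */
};

/* Store a name: inline if it fits, otherwise in separately allocated
 * memory. Returns 0 on success, -1 on allocation failure. */
int sketch_dentry_init(struct sketch_dentry *d, const char *s)
{
    size_t len;
    char *copy;

    assert(d && s);
    len = strlen(s);
    if (len < SKETCH_INLINE_LEN) {
        memcpy(d->iname, s, len + 1);
        d->name = d->iname;
        return 0;
    }
    copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, s, len + 1);
    d->iname[0] = '\0';              /* inline buffer is not used */
    d->name = copy;
    return 0;
}

void sketch_dentry_free(struct sketch_dentry *d)
{
    if (d->name != d->iname)
        free((void *)d->name);
}
```

Printing `iname` for a long name yields an empty string in this model, whereas `name` is correct in both cases.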
Duane Griffin
6ff232070a 9p: Remove potentially bad parameter from function entry debug print.
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2008-12-19 16:45:21 -06:00
James Morris
12204e24b1 security: pass mount flags to security_sb_kern_mount()
Pass mount flags to security_sb_kern_mount(), so security modules
can determine if a mount operation is being performed by the kernel.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
2008-12-20 09:02:39 +11:00
Chris Mason
b34b086c1c Btrfs: Fix compile warning around num_online_cpus() in a min statement
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-19 15:43:22 -05:00
Yan Zheng
1f80e4db0f Btrfs: set EXTENT_BOUNDARY bit before marking extent delalloc.
There is a race in relocate_inode_pages; it happens when
find_delalloc_range finds the delalloc extent before the
boundary bit is set.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-19 10:59:04 -05:00
Yan Zheng
34bf63c4dd Btrfs: properly update block accounting for metadata
This adds the missing block accounting code to finish_current_insert and makes
block accounting for root item properly protected by the delalloc spin lock.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-19 10:58:46 -05:00
Yan Zheng
ab67b7c1f7 Btrfs: Add missing mnt_drop_write in ioctl.c
This patch adds the missing mnt_drop_write to match
mnt_want_write in btrfs_ioctl_defrag and btrfs_ioctl_clone

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-19 10:58:39 -05:00
Ingo Molnar
30cd324e97 Merge branches 'tracing/ftrace', 'tracing/ring-buffer' and 'tracing/urgent' into tracing/core
Conflicts:
	include/linux/ftrace.h
2008-12-19 09:42:40 +01:00
Ken Chen
9c2c48020e schedstat: consolidate per-task cpu runtime stats
Impact: simplify code

When we turn on CONFIG_SCHEDSTATS, per-task cpu runtime is accumulated
twice. Once in task->se.sum_exec_runtime and once in sched_info.cpu_time.
These two stats are exactly the same.

Given that task->se.sum_exec_runtime is always accumulated by the core
scheduler, sched_info can reuse that data instead of duplicating the accounting.

Signed-off-by: Ken Chen <kenchen@google.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-18 13:54:01 +01:00
Paul Mackerras
c280266a32 Merge branch 'linux-2.6' into next
2008-12-18 11:06:12 +11:00
Linus Torvalds
0bc77ecbe4 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: Add JBD2 compat feature bit.
  ocfs2: Always update xattr search when creating bucket.
2008-12-17 15:01:23 -08:00
Jeff Layton
331c313510 cifs: fix buffer overrun in parse_DFS_referrals
While testing a kernel with memory poisoning enabled, I saw some warnings
about the redzone getting clobbered when chasing DFS referrals. The
buffer allocation for the unicode converted version of the searchName is
too small and needs to take null termination into account.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-12-17 14:59:55 -08:00
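The sizing rule behind this fix is that a buffer for a converted string needs room for one extra code unit as the terminator. The sketch below shows it for an ASCII-only conversion to 16-bit units. `utf16_buf_size` and `ascii_to_utf16` are hypothetical helpers written for this note. They are not the CIFS conversion routines and do not handle non-ASCII input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Bytes needed for the 16-bit copy of an ASCII string, including the
 * terminating null code unit. Dropping the "+ 1" is the overrun. */
size_t utf16_buf_size(const char *name)
{
    assert(name);
    return (strlen(name) + 1) * sizeof(uint16_t);
}

/* Widen an ASCII string to 16-bit units in host byte order.
 * The caller frees the result; returns NULL on allocation failure. */
uint16_t *ascii_to_utf16(const char *name)
{
    size_t len = strlen(name);
    uint16_t *out = malloc(utf16_buf_size(name));

    if (!out)
        return NULL;
    for (size_t i = 0; i < len; i++)
        out[i] = (unsigned char)name[i];
    out[len] = 0;   /* this write needs the extra unit */
    return out;
}
```

A buffer of only `strlen(name) * 2` bytes would make the final terminator write land just past the allocation, which is the kind of redzone corruption that memory poisoning reports.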
Yehuda Sadeh Weinraub
b16281c30c Btrfs: fix return value from btrfs_listxattr when buffer size is too small
The return value was being overwritten.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
2008-12-17 10:21:26 -05:00
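The commit message says only that the return value was being overwritten. One way such a bug can arise, sketched here under that assumption, is a listing loop that records an ERANGE error but then falls through and returns the accumulated size. The hypothetical `sketch_listxattr` below returns the error as soon as the buffer proves too small. It follows the listxattr convention that a size of 0 means "report the size needed".

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Copy n NUL-terminated names into buf back to back.
 * Returns the total bytes used (or needed when size == 0),
 * or -ERANGE if a non-zero buffer is too small. */
ssize_t sketch_listxattr(const char *const *names, size_t n,
                         char *buf, size_t size)
{
    size_t total = 0;

    assert(size == 0 || buf);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;

        if (size != 0) {
            if (total + len > size)
                return -ERANGE;   /* return now so nothing overwrites it */
            memcpy(buf + total, names[i], len);
        }
        total += len;
    }
    return (ssize_t)total;
}
```

A caller would typically probe with a size of 0, allocate the reported number of bytes, and call again.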
Chris Mason
cad321ad52 Btrfs: shift all end_io work to thread pools
bio_end_io for reads without checksumming on and btree writes were
happening without using async thread pools.  This means the extent_io.c
code had to use spin_lock_irq and friends on the rb tree locks for
extent state.

There were some irq safe vs unsafe lock inversions between the delalloc
lock and the extent state locks.  This patch gets rid of them by moving
all end_io code into the thread pools.

To avoid contention and deadlocks between the data end_io processing and the
metadata end_io processing yet another thread pool is added to finish
off metadata writes.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-17 14:51:42 -05:00
Yan Zheng
87b29b208c Btrfs: properly check free space for tree balancing
btrfs_insert_empty_items takes the space needed by the btrfs_item
structure into account when calculating the required free space.

So the tree balancing code shouldn't add sizeof(struct btrfs_item)
to the size when checking the free space. This patch removes these
superfluous additions.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>
2008-12-17 10:21:48 -05:00
Joel Becker
a97721894a ocfs2: Add JBD2 compat feature bit.
Define the OCFS2_FEATURE_COMPAT_JBD2 bit in the filesystem header.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-12-16 18:26:16 -08:00
Tao Ma
83099bc647 ocfs2: Always update xattr search when creating bucket.
When we create an xattr bucket during the process of xattr set, we always
need to update the ocfs2_xattr_search since even if the bucket size is
the same as block size, the offset will change because of the removal
of the ocfs2_xattr_block header.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2008-12-16 14:07:37 -08:00
Chris Mason
dcbdd4dcb9 Btrfs: delete checksum items before marking blocks free
Btrfs maintains a cache of blocks available for allocation in ram.  The
code that frees extents was marking the extents free and then deleting
the checksum items.

This meant it was possible the extent would be reallocated before the
checksum item was actually deleted, leading to races and other
problems as the checksums were updated for the newly allocated extent.

The fix is to delete the checksum before marking the extent free.

Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-12-16 13:51:01 -05:00
Dave Kleikamp
d69e83d99c jfs: ensure symlinks are NUL-terminated
This is an alternate fix for a bug reported and fixed by Duane Griffin.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Reported-by: Duane Griffin <duaneg@dghda.com>
2008-12-16 10:21:34 -06:00
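The message does not describe the fix itself, so the following is only a generic sketch of the technique named in the subject: forcing a terminator onto a fixed-size on-disk string before treating it as a C string. `SKETCH_INLINE_LEN` and `terminate_symlink` are hypothetical, and the 16-byte inline area is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_INLINE_LEN 16   /* assumed size of the inline area */

/* Make sure the inline symlink target is NUL-terminated, even when
 * the recorded size is corrupt or fills the whole area. */
void terminate_symlink(char *inline_data, size_t recorded_size)
{
    size_t end;

    assert(inline_data);
    /* Clamp so the terminator always lands inside the buffer. */
    end = recorded_size < SKETCH_INLINE_LEN ? recorded_size
                                            : SKETCH_INLINE_LEN - 1;
    inline_data[end] = '\0';
}
```

Clamping to the last byte trades a possibly truncated target for the guarantee that later string functions cannot read past the buffer.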
Ingo Molnar
f65cb45cba perfcounters: flush on setuid exec
Pavel Machek pointed out that performance counters should be flushed
when crossing protection domains on setuid execution.

Reported-by: Pavel Machek <pavel@suse.cz>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 14:00:15 +01:00
KOSAKI Motohiro
13bd41bc22 proc: enclose desc variable of show_stat() in CONFIG_SPARSE_IRQ
Impact: restructure code to fix compiler warning

commit 240d367b4e moved desc usage point
into #ifdef CONFIG_SPARSE_IRQ.

Eliminate the desc variable, otherwise the following warning appears:

 fs/proc/stat.c: In function 'show_stat':
 fs/proc/stat.c:31: warning: unused variable 'desc'

[ akpm: cleaned up the patch to remove #ifdef ]

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-16 11:24:14 +01:00
Anton Vorontsov
6b82b3e4b5 powerpc: Remove `have_of' global variable
The `have_of' variable is a relic from the arch/ppc time, it isn't
useful nowadays.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
2008-12-16 15:52:57 +11:00