Commit Graph

15761 Commits

Author SHA1 Message Date
Zhang Le
ee6f779b9e filp->f_pos not correctly updated in proc_task_readdir
filp->f_pos only get updated at the end of the function. Thus d_off of those
dirents who are in the middle will be 0, and this will cause a problem in
glibc's readdir implementation, specifically endless loop. Because when overflow
occurs, f_pos will be set to next dirent to read, however it will be 0, unless
the next one is the last one. So it will start over again and again.

There is a sample program in man 2 gendents. This is the output of the program
running on a multithread program's task dir before this patch is applied:

  $ ./a.out /proc/3807/task
  --------------- nread=128 ---------------
  i-node#  file type  d_reclen  d_off   d_name
    506442  directory    16          1  .
    506441  directory    16          0  ..
    506443  directory    16          0  3807
    506444  directory    16          0  3809
    506445  directory    16          0  3812
    506446  directory    16          0  3861
    506447  directory    16          0  3862
    506448  directory    16          8  3863

This is the output after this patch is applied

  $ ./a.out /proc/3807/task
  --------------- nread=128 ---------------
  i-node#  file type  d_reclen  d_off   d_name
    506442  directory    16          1  .
    506441  directory    16          2  ..
    506443  directory    16          3  3807
    506444  directory    16          4  3809
    506445  directory    16          5  3812
    506446  directory    16          6  3861
    506447  directory    16          7  3862
    506448  directory    16          8  3863

Signed-off-by: Zhang Le <r0bertz@gentoo.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-16 07:51:33 -07:00
Jonathan Corbet
60aa49243d Rationalize fasync return values
Most fasync implementations do something like:

     return fasync_helper(...);

But fasync_helper() will return a positive value at times - a feature used
in at least one place.  Thus, a number of other drivers do:

     err = fasync_helper(...);
     if (err < 0)
             return err;
     return 0;

In the interests of consistency and more concise code, it makes sense to
map positive return values onto zero where ->fasync() is called.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:34:35 -06:00
Jonathan Corbet
76398425bb Move FASYNC bit handling to f_op->fasync()
Removing the BKL from FASYNC handling ran into the challenge of keeping the
setting of the FASYNC bit in filp->f_flags atomic with regard to calls to
the underlying fasync() function.  Andi Kleen suggested moving the handling
of that bit into fasync(); this patch does exactly that.  As a result, we
have a couple of internal API changes: fasync() must now manage the FASYNC
bit, and it will be called without the BKL held.

As it happens, every fasync() implementation in the kernel with one
exception calls fasync_helper().  So, if we make fasync_helper() set the
FASYNC bit, we can avoid making any changes to the other fasync()
functions - as long as those functions, themselves, have proper locking.
Most fasync() implementations do nothing but call fasync_helper() - which
has its own lock - so they are easily verified as correct.  The BKL had
already been pushed down into the rest.

The networking code has its own version of fasync_helper(), so that code
has been augmented with explicit FASYNC bit handling.

Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: David Miller <davem@davemloft.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Jonathan Corbet
db1dd4d376 Use f_lock to protect f_flags
Traditionally, changes to struct file->f_flags have been done under BKL
protection, or with no protection at all.  This patch causes all f_flags
changes after file open/creation time to be done under protection of
f_lock.  This allows the removal of some BKL usage and fixes a number of
longstanding (if microscopic) races.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Jonathan Corbet
6849991490 Rename struct file->f_ep_lock
This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock.  For now,
epoll remains the only user, but a future patch will use it to protect
f_flags as well.

Cc: Davide Libenzi <davidel@xmailserver.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2009-03-16 08:32:27 -06:00
Felix Blyakher
4740cd8b4f Merge branch 'master' of git://git.kernel.org/pub/scm/fs/xfs/xfs 2009-03-16 09:15:27 -05:00
Felix Blyakher
cb1b778040 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2009-03-16 09:15:13 -05:00
Artem Bityutskiy
fb1cd01a33 UBIFS: introduce a helpful variable
This patch introduces a helpful @c->idx_leb_size variable.
The patch also fixes some spelling issues and makes comments
use "LEB" instead of "eraseblock", which is more correct.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-16 10:52:02 +02:00
Artem Bityutskiy
c9927c3ee2 UBIFS: use KERN_CONT
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-16 10:52:02 +02:00
Artem Bityutskiy
0a6fb8d9c4 UBIFS: fix lprops committing bug
When writing lprop nodes, do not forget to set @from to 0 when
switching the LEB. This fixes the following bug:

UBIFS error (pid 27768): ubifs_leb_write: writing -15456 bytes at 16:15880, error -22
UBIFS error (pid 27768): do_commit: commit failed, error -22
UBIFS warning (pid 27768): ubifs_ro_mode: switched to read-only mode, error -22
Pid: 27768, comm: freespace Not tainted 2.6.29-rc4-ubifs-2.6 #43
Call Trace:
 [<ffffffffa00c46d6>] ubifs_ro_mode+0x54/0x56 [ubifs]
 [<ffffffffa00cfa16>] do_commit+0x4f5/0x50a [ubifs]
 [<ffffffffa00cfae7>] ubifs_run_commit+0xbc/0xdb [ubifs]
 [<ffffffffa00d42b9>] ubifs_budget_space+0x742/0x9ed [ubifs]
 [<ffffffff812daf45>] ? __mutex_lock_common+0x361/0x3ae
 [<ffffffffa00bc437>] ? ubifs_write_begin+0x18d/0x44c [ubifs]
 [<ffffffffa00bc5cb>] ubifs_write_begin+0x321/0x44c [ubifs]
 [<ffffffff8106222b>] ? trace_hardirqs_on_caller+0x1f/0x14d
 [<ffffffff81097ce2>] generic_file_buffered_write+0x12f/0x2d9
 [<ffffffff8109828d>] __generic_file_aio_write_nolock+0x261/0x295
 [<ffffffff81098aff>] generic_file_aio_write+0x69/0xc5
 [<ffffffffa00bb914>] ubifs_aio_write+0x14c/0x19e [ubifs]
 [<ffffffff810c8f42>] do_sync_write+0xe7/0x12d
 [<ffffffff81055378>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff81149edc>] ? security_file_permission+0x11/0x13
 [<ffffffff810c9827>] vfs_write+0xab/0x105
 [<ffffffff810c9945>] sys_write+0x47/0x6f
 [<ffffffff8100c35b>] system_call_fastpath+0x16/0x1b

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-16 10:51:51 +02:00
Ingo Molnar
7243f2145a Merge branches 'tracing/ftrace', 'tracing/syscalls' and 'linus' into tracing/core
Conflicts:
	arch/parisc/kernel/irq.c
2009-03-16 09:12:42 +01:00
Dave Chinner
6cc87645e2 xfs: factor out code to find the longest free extent in the AG
Signed-off-by: Dave Chinner <dgc@sgi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2009-03-16 08:29:46 +01:00
Christoph Hellwig
cb4c8cc1e9 xfs: kill VN_BAD
Remove this rather pointless wrapper and use is_bad_inode directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:25:25 +01:00
Christoph Hellwig
8fab451e3c xfs: kill vn_atime_* helpers.
Two out of three are unused already, and the third is better done open-coded
with a comment describing what's going on here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:24:46 +01:00
Christoph Hellwig
076e6acb8f xfs: cleanup xlog_bread
Most callers of xlog_bread need to call xlog_align to get the actual offset.
Consolidate that call into the main xlog_bread and provide a _xlog_bread
for those few that don't want the actual offset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:24:13 +01:00
Christoph Hellwig
ff0205e032 xfs: cleanup xlog_recover_do_trans
Change the big if-elsif-else block handling the different item types
into a more natural switch, remove assignments in conditionals and
remove an out of place comment from centuries ago on IRIX.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:20:52 +01:00
Christoph Hellwig
dd0bbad81c xfs: remove another leftover of the old inode log item format
There's another little snipplet of code left from the handling of the old
inode log item format in xlog_recover_do_inode_trans.  Kill it as it
can't be reached anymore.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:19:59 +01:00
Christoph Hellwig
21b699c895 xfs: cleanup log unmount handling
Kill the current xfs_log_unmount wrapper and opencode the two function
calls in the only caller.  Rename the current xfs_log_unmount_dealloc to
xfs_log_unmount as it undoes xfs_log_mount and the new name makes that
more clear.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:19:29 +01:00
Artem Bityutskiy
b221337ae4 UBIFS: fix bogus assertion
Empty journal head LEBs are accounted as taken empty as well, so
the GC LEB does not have to be the only taken empty LEB when
nounting/remounting.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-15 17:20:22 +02:00
Felix Blyakher
da5309cd28 Fix xfs debug build breakage by pushing xfs_error.h after
xfs_mount.h, which it depends on.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-03-15 08:10:25 -05:00
Linus Torvalds
18553c38bc Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  Fix Xilinx SystemACE driver to handle empty CF slot
  block: fix memory leak in bio_clone()
  block: Add gfp_mask parameter to bio_integrity_clone()
2009-03-14 13:43:18 -07:00
Li Zefan
059ea3318c block: fix memory leak in bio_clone()
If bio_integrity_clone() fails, bio_clone() returns NULL without freeing
the newly allocated bio.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-14 21:06:52 +01:00
un'ichi Nomura
87092698c6 block: Add gfp_mask parameter to bio_integrity_clone()
Stricter gfp_mask might be required for clone allocation.
For example, request-based dm may clone bio in interrupt context
so it has to use GFP_ATOMIC.

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-03-14 21:06:51 +01:00
Linus Torvalds
f1823acfbc Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...
  SUNRPC: xprt_connect() don't abort the task if the transport isn't bound
  SUNRPC: Fix an Oops due to socket not set up yet...
  Bug 11061, NFS mounts dropped
  NFS: Handle -ESTALE error in access()
  NLM: Fix GRANT callback address comparison when IPv6 is enabled
  NLM: Shrink the IPv4-only version of nlm_cmp_addr()
  NFSv3: Fix posix ACL code
  NFS: Fix misparsing of nfsv4 fs_locations attribute (take 2)
  SUNRPC: Tighten up the task locking rules in __rpc_execute()
2009-03-14 12:00:18 -07:00
Linus Torvalds
ff9cb43ce0 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  ocfs2: Use xs->bucket to set xattr value outside
  ocfs2: Fix a bug found by sparse check.
  ocfs2: tweak to get the maximum inline data size with xattr
  ocfs2: reserve xattr block for new directory with inline data
2009-03-14 11:59:22 -07:00
Tyler Hicks
84814d642a eCryptfs: don't encrypt file key with filename key
eCryptfs has file encryption keys (FEK), file encryption key encryption
keys (FEKEK), and filename encryption keys (FNEK).  The per-file FEK is
encrypted with one or more FEKEKs and stored in the header of the
encrypted file.  I noticed that the FEK is also being encrypted by the
FNEK.  This is a problem if a user wants to use a different FNEK than
their FEKEK, as their file contents will still be accessible with the
FNEK.

This is a minimalistic patch which prevents the FNEKs signatures from
being copied to the inode signatures list.  Ultimately, it keeps the FEK
from being encrypted with a FNEK.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-14 11:57:22 -07:00
Johannes Weiner
15e7b87676 nommu: ramfs: don't leak pages when adding to page cache fails
When a ramfs nommu mapping is expanded, contiguous pages are allocated
and added to the pagecache.  The caller's reference is then passed on
by moving whole pagevecs to the file lru list.

If the page cache adding fails, make sure that the error path also
moves the pagevec contents which might still contain up to PAGEVEC_SIZE
successfully added pages, of which we would leak references otherwise.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Enrik Berkhan <Enrik.Berkhan@ge.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-14 11:57:22 -07:00
Enrik Berkhan
020fe22ff1 nommu: ramfs: pages allocated to an inode's pagecache may get wrongly discarded
The pages attached to a ramfs inode's pagecache by truncation from nothing
- as done by SYSV SHM for example - may get discarded under memory
pressure.

The problem is that the pages are not marked dirty.  Anything that creates
data in an MMU-based ramfs will cause the pages holding that data will
cause the set_page_dirty() aop to be called.

For the NOMMU-based mmap, set_page_dirty() may be called by write(), but
it won't be called by page-writing faults on writable mmaps, and it isn't
called by ramfs_nommu_expand_for_mapping() when a file is being truncated
from nothing to allocate a contiguous run.

The solution is to mark the pages dirty at the point of allocation by the
truncation code.

Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-14 11:57:22 -07:00
Eric Sandeen
8d03c7a0c5 ext4: fix bogus BUG_ONs in in mballoc code
Thiemo Nagel reported that:

# dd if=/dev/zero of=image.ext4 bs=1M count=2
# mkfs.ext4 -v -F -b 1024 -m 0 -g 512 -G 4 -I 128 -N 1 \
  -O large_file,dir_index,flex_bg,extent,sparse_super image.ext4
# mount -o loop image.ext4 mnt/
# dd if=/dev/zero of=mnt/file

oopsed, with a BUG_ON in ext4_mb_normalize_request because
size == EXT4_BLOCKS_PER_GROUP

It appears to me (esp. after talking to Andreas) that the BUG_ON
is bogus; a request of exactly EXT4_BLOCKS_PER_GROUP should
be allowed, though larger sizes do indicate a problem.

Fix that an another (apparently rare) codepath with a similar check.

Reported-by: Thiemo Nagel <thiemo.nagel@ph.tum.de>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-03-14 11:51:46 -04:00
Adrian Hunter
f55aa59106 UBIFS: fix bug where page is marked uptodate when out of space
UBIFS fast path in write_begin may mark a page up to date
and then discover that there may not be enough space to do
the write, and so fall back to a slow path.  The slow path
tries harder, but may still find no space - leaving the page
marked up to date, when it is not.  This patch ensures that
the page is marked not up to date in that case.

The bug that this patch fixes becomes evident when the write
is into a hole (sparse file) or is at the end of the file
and a subsequent read is off the end of the file.  In both
cases, the file system should return zeros but was instead
returning the page that had not been written because the
file system was out of space.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2009-03-14 16:46:33 +02:00
Ingo Molnar
480c93df5b Merge branch 'core/locking' into tracing/ftrace 2009-03-13 01:33:21 +01:00
Tao Ma
712e53e46a ocfs2: Use xs->bucket to set xattr value outside
A long time ago, xs->base is allocated a 4K size and all the contents
in the bucket are copied to the it. Now we use ocfs2_xattr_bucket to
abstract xattr bucket and xs->base is initialized to the start of the
bu_bhs[0]. So xs->base + offset will overflow when the value root is
stored outside the first block.

Then why we can survive the xattr test by now? It is because we always
read the bucket contiguously now and kernel mm allocate continguous
memory for us. We are lucky, but we should fix it. So just get the
right value root as other callers do.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-03-12 16:46:09 -07:00
Tao Ma
74e77eb30d ocfs2: Fix a bug found by sparse check.
We need to use le32_to_cpu to test rec->e_cpos in
ocfs2_dinode_insert_check.

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-03-12 16:46:01 -07:00
Tiger Yang
d9ae49d6e2 ocfs2: tweak to get the maximum inline data size with xattr
Replace max_inline_data with max_inline_data_with_xattr
to ensure it correct when xattr inlined.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-03-12 16:45:46 -07:00
Tiger Yang
6c9fd1dc0a ocfs2: reserve xattr block for new directory with inline data
If this is a new directory with inline data, we choose to
reserve the entire inline area for directory contents and
force an external xattr block.

Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Acked-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
2009-03-12 16:45:40 -07:00
Linus Torvalds
188de5ec56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: Valid filesystems are flagged as bad by the corrupted fs patch
2009-03-12 16:32:36 -07:00
Nick Piggin
7ef0d7377c fs: new inode i_state corruption fix
There was a report of a data corruption
http://lkml.org/lkml/2008/11/14/121.  There is a script included to
reproduce the problem.

During testing, I encountered a number of strange things with ext3, so I
tried ext2 to attempt to reduce complexity of the problem.  I found that
fsstress would quickly hang in wait_on_inode, waiting for I_LOCK to be
cleared, even though instrumentation showed that unlock_new_inode had
already been called for that inode.  This points to memory scribble, or
synchronisation problme.

i_state of I_NEW inodes is not protected by inode_lock because other
processes are not supposed to touch them until I_LOCK (and I_NEW) is
cleared.  Adding WARN_ON(inode->i_state & I_NEW) to sites where we modify
i_state revealed that generic_sync_sb_inodes is picking up new inodes from
the inode lists and passing them to __writeback_single_inode without
waiting for I_NEW.  Subsequently modifying i_state causes corruption.  In
my case it would look like this:

CPU0                            CPU1
unlock_new_inode()              __sync_single_inode()
 reg <- inode->i_state
 reg -> reg & ~(I_LOCK|I_NEW)   reg <- inode->i_state
 reg -> inode->i_state          reg -> reg | I_SYNC
                                reg -> inode->i_state

Non-atomic RMW on CPU1 overwrites CPU0 store and sets I_LOCK|I_NEW again.

Fix for this is rather than wait for I_NEW inodes, just skip over them:
inodes concurrently being created are not subject to data integrity
operations, and should not significantly contribute to dirty memory
either.

After this change, I'm unable to reproduce any of the added warnings or
hangs after ~1hour of running.  Previously, the new warnings would start
immediately and hang would happen in under 5 minutes.

I'm also testing on ext3 now, and so far no problems there either.  I
don't know whether this fixes the problem reported above, but it fixes a
real problem for me.

Cc: "Jorge Boncompte [DTI2]" <jorge@dti2.net>
Reported-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:20:24 -07:00
Li Zefan
a3cfbb53b1 vfs: add missing unlock in sget()
In sget(), destroy_super(s) is called with s->s_umount held, which makes
lockdep unhappy.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:20:23 -07:00
Oleg Nesterov
e5bc49ba74 pipe_rdwr_fasync: fix the error handling to prevent the leak/crash
If the second fasync_helper() fails, pipe_rdwr_fasync() returns the error
but leaves the file on ->fasync_readers.

This was always wrong, but since 233e70f422
"saner FASYNC handling on file close" we have the new problem.  Because in
this case setfl() doesn't set FASYNC bit, __fput() will not do
->fasync(0), and we leak fasync_struct with ->fa_file pointing to the
freed file.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-03-12 16:20:23 -07:00
Trond Myklebust
9f4c899c0d NFS: Fix the fix to Bugzilla #11061, when IPv6 isn't defined...
Stephen Rothwell reports:

Today's linux-next build (powerpc ppc64_defconfig) failed like this:

fs/built-in.o: In function `.nfs_get_client':
client.c:(.text+0x115010): undefined reference to `.__ipv6_addr_type'

Fix by moving the IPV6 specific parts of commit
d7371c41b0 ("Bug 11061, NFS mounts dropped")
into the '#ifdef IPV6..." section.

Also fix up a couple of formatting issues.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2009-03-12 14:51:32 -04:00
Theodore Ts'o
2842c3b544 ext4: Print the find_group_flex() warning only once
This is a short-term warning, and even printk_ratelimit() can result
in too much noise in system logs.  So only print it once as a warning.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-03-12 12:20:01 -04:00
Phillip Lougher
363911d027 Squashfs: Valid filesystems are flagged as bad by the corrupted fs patch
The corrupted filesystem patch added a check against zlib trying to
output too much data in the presence of data corruption.  This check
triggered if zlib_inflate asked to be called again (Z_OK) with
avail_out == 0 and no more output buffers available.  This check proves
to be rather dumb, as it incorrectly catches the case where zlib has
generated all the output, but there are still input bytes to be processed.

This patch does a number of things.  It removes the original check and
replaces it with code to not move to the next output buffer if there
are no more output buffers available, relying on zlib to error if it
wants an extra output buffer in the case of data corruption.  It
also replaces the Z_NO_FLUSH flag with the more correct Z_SYNC_FLUSH
flag, and makes the error messages more understandable to
non-technical users.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Stefan Lippers-Hollmann <s.L-H@gmx.de>
2009-03-12 03:23:48 +00:00
Steve French
64cc2c6369 [CIFS] work around bug in Samba server handling for posix open
Samba server (version 3.3.1 and earlier, and 3.2.8 and earlier) incorrectly
required the O_CREAT flag on posix open (even when a file was not being
created).  This disables posix open (create is still ok) after the first
attempt returns EINVAL (and logs an error, once, recommending that they
update their server).

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:21 +00:00
Steve French
276a74a483 [CIFS] Use posix open on file open when server supports it
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:21 +00:00
Jeff Layton
fcc7c09d94 cifs: fix buffer format byte on NT Rename/hardlink
Discovered at Connnectathon 2009...

The buffer format byte and the pad are transposed in NT_RENAME calls
(which are used to set hardlinks). Most servers seem to ignore this
fact, but NetApp filers throw back an error due to this problem. This
patch fixes it.

CC: Stable <stable@kernel.org>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:21 +00:00
Steve French
0382457744 [CIFS] Add definitions for remoteably fsctl calls
There are about 60 fsctl calls which Windows claims would be able
to be sent remotely and handled by the server. This adds the #defines
for them.  A few of them look immediately useful, but need to also
add the structure definitions for them so they can be sent as SMBs.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:21 +00:00
Steve French
1adcb71092 [CIFS] add extra null attr check
Although attr == NULL can not happen, this makes cifs_set_file_info safer
in the future since it may not be obvious that the caller can not set
attr to NULL.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:21 +00:00
Steve French
4717bed680 [CIFS] fix build error
Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:20 +00:00
Steve French
7fc8f4e95b [CIFS] reopen file via newer posix open protocol operation if available
If the network connection crashes, and we have to reopen files, preferentially
use the newer cifs posix open protocol operation if the server supports it.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:20 +00:00
Steve French
be652445fd [CIFS] Add new nostrictsync cifs mount option to avoid slow SMB flush
If this mount option is set, when an application does an
fsync call then the cifs client does not send an SMB Flush
to the server (to force the server to write all dirty data
for this file immediately to disk), although cifs still sends
all dirty (cached) file data to the server and waits for the
server to respond to the write write.  Since SMB Flush can be
very slow, and some servers may be reliable enough (to risk
delaying slightly flushing the data to disk on the server),
turning on this option may be useful to improve performance for
applications that fsync too much, at a small risk of server
crash.  If this mount option is not set, by default cifs will
send an SMB flush request (and wait for a response) on every
fsync call.

Signed-off-by: Steve French <sfrench@us.ibm.com>
2009-03-12 01:36:20 +00:00