Commit Graph

30831 Commits

Author SHA1 Message Date
Josef Bacik
ee670f0af3 Btrfs: fix btrfs_destroy_marked_extents
So we're forcing the eb's to have their ref count set to 1 so invalidatepage
works but this breaks lots of things, for example root nodes, and is just
plain wrong, we don't need to just evict all of this stuff.  Also drop the
invalidatepage altogether and add a page_cache_release().  With this patch
we no longer hang when trying to access the root nodes after an aborted
transaction and we no longer leak memory.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-14 21:29:14 -04:00
Josef Bacik
7b8b92af58 Btrfs: abort the transaction if the commit fails
If a transaction commit fails we don't abort it so we don't set an error on
the file system.  This patch fixes that by actually calling the abort stuff
and then adding a check for a fs error in the transaction start stuff to
make sure it is caught properly.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-14 21:29:13 -04:00
Josef Bacik
d7096fc3ef Btrfs: wake up transaction waiters when aborting a transaction
I was getting lots of hung tasks and a NULL pointer dereference because we
are not cleaning up the transaction properly when it aborts.  First we need
to reset the running_transaction to NULL so we don't get a bad dereference
for any start_transaction callers after this.  Also we cannot rely on
waitqueue_active() since it's just a list_empty(), so just call wake_up()
directly since that will do the barrier for us and such.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-14 21:29:12 -04:00
Josef Bacik
b939d1ab76 Btrfs: fix locking in btrfs_destroy_delayed_refs
The transaction abort stuff was throwing warnings from the list debugging
code because we do a list_del_init outside of the delayed_refs spin lock.
The delayed refs locking makes baby Jesus cry so it's not hard to get wrong,
but we need to take the ref head mutex to make sure it's not being processed
currently, and so if it is we need to drop the spin lock and then take and
drop the mutex and do the search again.  If we can take the mutex then we
can safely remove the head from the list and carry on.  Now when the
transaction aborts I don't get the list debugging warnings.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-14 21:29:11 -04:00
Josef Bacik
beb42dd793 Btrfs: pass locked_page into extent_clear_unlock_delalloc if theres an error
While doing my enospc work I got a transaction abortion that resulted in a
panic when we tried to unlock_page() an already unlocked page.  This is
because we aren't calling extent_clear_unlock_delalloc with the locked page
so it was unlocking all the pages in the range.  This is wrong since
__extent_writepage expects to have the page locked still unless we return
*page_started as 1.  This should keep us from panicing.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
2012-06-14 21:29:09 -04:00
J. Bruce Fields
bc2df47a40 nfsd4: BUG_ON(!is_spin_locked()) no good on UP kernels
Most frequent symptom was a BUG triggering in expire_client, with the
server locking up shortly thereafter.

Introduced by 508dc6e110 "nfsd41:
free_session/free_client must be called under the client_lock".

Cc: stable@kernel.org
Cc: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-14 13:54:08 -04:00
Stanislav Kinsbursky
12918b10d5 NFS: hard-code init_net for NFS callback transports
In case of destroying mount namespace on child reaper exit, nsproxy is zeroed
to the point already. So, dereferencing of it is invalid.
This patch hard-code "init_net" for all network namespace references for NFS
callback services. This will be fixed with proper NFS callback
containerization.

Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2012-06-14 13:53:43 -04:00
Chen Baozi
51c84223af xfs: fix typo in comment of xfs_dinode_t.
There should be "XFS_DFORK_DPTR, XFS_DFORK_APTR, and XFS_DFORK_PTR" instead
of "XFS_DFORK_PTR, XFS_DFORK_DPTR, and XFS_DFORK_PTR".

Signed-off-by: Chen Baozi <baozich@gmail.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:26 -05:00
Dave Chinner
5276432997 xfs: kill copy and paste segment checks in xfs_file_aio_read
The generic segment check code now returns a count of the number of
bytes in the iovec, so we don't need to roll our own anymore.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:25 -05:00
Dave Chinner
32972383ca xfs: make largest supported offset less shouty
XFS_MAXIOFFSET() is just a simple macro that resolves to
mp->m_maxioffset. It doesn't need to exist, and it just makes the
code unnecessarily loud and shouty.

Make it quiet and easy to read.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:24 -05:00
Dave Chinner
d2c2819117 xfs: m_maxioffset is redundant
The m_maxioffset field in the struct xfs_mount contains the same
value as the superblock s_maxbytes field. There is no need to carry
two copies of this limit around, so use the VFS superblock version.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:22 -05:00
Jeff Liu
0f2cf9d3d9 xfs: fix debug_object WARN at xfs_alloc_vextent()
Fengguang reports:

[  780.529603] XFS (vdd): Ending clean mount
[  781.454590] ODEBUG: object is on stack, but not annotated
[  781.455433] ------------[ cut here ]------------
[  781.455433] WARNING: at /c/kernel-tests/sound/lib/debugobjects.c:301 __debug_object_init+0x173/0x1f1()
[  781.455433] Hardware name: Bochs
[  781.455433] Modules linked in:
[  781.455433] Pid: 26910, comm: kworker/0:2 Not tainted 3.4.0+ #51
[  781.455433] Call Trace:
[  781.455433]  [<ffffffff8106bc84>] warn_slowpath_common+0x83/0x9b
[  781.455433]  [<ffffffff8106bcb6>] warn_slowpath_null+0x1a/0x1c
[  781.455433]  [<ffffffff814919a5>] __debug_object_init+0x173/0x1f1
[  781.455433]  [<ffffffff81491c65>] debug_object_init+0x14/0x16
[  781.455433]  [<ffffffff8108842a>] __init_work+0x20/0x22
[  781.455433]  [<ffffffff8134ea56>] xfs_alloc_vextent+0x6c/0xd5

Use INIT_WORK_ONSTACK in xfs_alloc_vextent instead of INIT_WORK.

Reported-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:21 -05:00
Alain Renaud
7d0fa3ecba xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)
On filesytems with a block size smaller than PAGE_SIZE we currently have
a problem with unwritten extents.  If a we have multi-block page for
which an unwritten extent has been allocated, and only some of the
buffers have been written to, and they are not contiguous, we can expose
stale data from disk in the blocks between the writes after extent
conversion.

Example of a page with unwritten and real data.
buffer  content
0       empty  b_state = 0
1       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
2       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
3       empty  b_state = 0
4       empty  b_state = 0
5       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
6       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
7       empty  b_state = 0

Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
empty.  Currently buffers 1, 2, 5, and 6 are added to a single ioend,
and when IO has completed, extent conversion creates a real extent from
block 1 through block 6, leaving 0 and 7 unwritten.  However buffers 3
and 4 were not written to disk, so stale data is exposed from those
blocks on a subsequent read.

Fix this by setting iomap_valid = 0 when we find a buffer that is not
Uptodate.  This ensures that buffers 5 and 6 are not added to the same
ioend as buffers 1 and 2.  Later these blocks will be converted into two
separate real extents, leaving the blocks in between unwritten.

Signed-off-by: Alain Renaud <arenaud@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-06-14 12:28:20 -05:00
Jan Schmidt
3310c36eef Btrfs: fix race in tree mod log addition
When adding to the tree modification log, we grab two locks at different
stages. We must not drop the outer lock until we're done with section
protected by the inner lock. This moves the unlock call for the outer lock
to the appropriate position.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:52:39 +02:00
Jan Schmidt
3d7806eca4 Btrfs: add btrfs_next_old_leaf
To make sense of the tree mod log, the backref walker not only needs
btrfs_search_old_slot, but it also called btrfs_next_leaf, which in turn was
calling btrfs_search_slot. This obviously didn't give the correct result.

This commit adds btrfs_next_old_leaf, a drop-in replacement for
btrfs_next_leaf with a time_seq parameter. If it is zero, it behaves exactly
like btrfs_next_leaf. If it is non-zero, it will use btrfs_search_old_slot
with this time_seq parameter.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:52:09 +02:00
Jan Schmidt
a95236d99f Btrfs: fix return value for __tree_mod_log_oldest_root
In __tree_mod_log_oldest_root() we must return the found operation even if
it's not a ROOT_REPLACE operation. Otherwise, the caller assumes that there
are no operations to be rewinded and returns immediately.

The code in the caller is modified to improve readability.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:44:22 +02:00
Jan Schmidt
8ba97a15e7 Btrfs: use btrfs_read_lock_root_node in get_old_root
get_old_root could race with root node updates because we weren't locking
the node early enough. Use btrfs_read_lock_root_node to grab the root locked
in the very beginning and release the lock as soon as possible (just like
btrfs_search_slot does).

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:44:21 +02:00
Jan Schmidt
f617e2fd52 Btrfs: remove obsolete btrfs_next_leaf call from __resolve_indirect_ref
When resolving indirect refs, we used to call btrfs_next_leaf in case we
didn't find an exact match. While we should find exact matches most of the
time, in case we don't, we must continue searching. Treating those matches
differently depending on the level we're searching doesn't make sense.

Even worse, we might end up searching for a key larger than the largest, in
which case there is no next_leaf and subsequent jobs would fail. This commit
drops the bogous lines.

Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
2012-06-14 18:44:20 +02:00
Bob Peterson
666d1d8ad2 GFS2: Combine functions get_local_rgrp and gfs2_inplace_reserve
This function combines rgrp functions get_local_rgrp and
gfs2_inplace_reserve so that the double retry loop is gone.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-06-14 09:58:40 +01:00
Anton Vorontsov
521f7288a8 pstore/platform: Disable automatic updates by default
Having automatic updates seems pointless for production system, and
even dangerous and thus counter-productive:

1. If we can mount pstore, or read files, we can as well read
   /proc/kmsg. So, there's little point in duplicating the
   functionality and present the same information but via another
   userland ABI;

2. Expecting the kernel to behave sanely after oops/panic is naive.
   It might work, but you'd rather not try it. Screwed up kernel
   can do rather bad things, like recursive faults[1]; and pstore
   rather provoking bad things to happen. It uses:

   1. Timers (assumes sane interrupts state);
   2. Workqueues and mutexes (assumes scheduler in a sane state);
   3. kzalloc (a working slab allocator);

   That's too much for a dead kernel, so the debugging facility
   itself might just make debugging harder, which is not what
   we want.

Maybe for non-oops message types it would make sense to re-enable
automatic updates, but so far I don't see any use case for this.
Even for tracing, it has its own run-time/normal ABI, so we're
only interested in pstore upon next boot, to retrieve what has
gone wrong with HW or SW.

So, let's disable the updates by default.

[1]
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff8104801b>] kthread_data+0xb/0x20
[...]
Process kworker/0:1 (pid: 14, threadinfo ffff8800072c0000, task ffff88000725b100)
[...
Call Trace:
 [<ffffffff81043710>] wq_worker_sleeping+0x10/0xa0
 [<ffffffff813687a8>] __schedule+0x568/0x7d0
 [<ffffffff8106c24d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81087e22>] ? call_rcu_sched+0x12/0x20
 [<ffffffff8102b596>] ? release_task+0x156/0x2d0
 [<ffffffff8102b45e>] ? release_task+0x1e/0x2d0
 [<ffffffff8106c24d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81368ac4>] schedule+0x24/0x70
 [<ffffffff8102cba8>] do_exit+0x1f8/0x370
 [<ffffffff810051e7>] oops_end+0x77/0xb0
 [<ffffffff8135c301>] no_context+0x1a6/0x1b5
 [<ffffffff8135c4de>] __bad_area_nosemaphore+0x1ce/0x1ed
 [<ffffffff81053156>] ? ttwu_queue+0xc6/0xe0
 [<ffffffff8135c50b>] bad_area_nosemaphore+0xe/0x10
 [<ffffffff8101fa47>] do_page_fault+0x2c7/0x450
 [<ffffffff8106e34b>] ? __lock_release+0x6b/0xe0
 [<ffffffff8106bf21>] ? mark_held_locks+0x61/0x140
 [<ffffffff810502fe>] ? __wake_up+0x4e/0x70
 [<ffffffff81185f7d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
 [<ffffffff81158970>] ? pstore_register+0x120/0x120
 [<ffffffff8136a37f>] page_fault+0x1f/0x30
 [<ffffffff81158970>] ? pstore_register+0x120/0x120
 [<ffffffff81185ab8>] ? memcpy+0x68/0x110
 [<ffffffff8115875a>] ? pstore_get_records+0x3a/0x130
 [<ffffffff811590f4>] ? persistent_ram_copy_old+0x64/0x90
 [<ffffffff81158bf4>] ramoops_pstore_read+0x84/0x130
 [<ffffffff81158799>] pstore_get_records+0x79/0x130
 [<ffffffff81042536>] ? process_one_work+0x116/0x450
 [<ffffffff81158970>] ? pstore_register+0x120/0x120
 [<ffffffff8115897e>] pstore_dowork+0xe/0x10
 [<ffffffff81042594>] process_one_work+0x174/0x450
 [<ffffffff81042536>] ? process_one_work+0x116/0x450
 [<ffffffff81042e13>] worker_thread+0x123/0x2d0
 [<ffffffff81042cf0>] ? manage_workers.isra.28+0x120/0x120
 [<ffffffff81047d8e>] kthread+0x8e/0xa0
 [<ffffffff8136ba74>] kernel_thread_helper+0x4/0x10
 [<ffffffff8136a199>] ? retint_restore_args+0xe/0xe
 [<ffffffff81047d00>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8136ba70>] ? gs_change+0xb/0xb
Code: be e2 00 00 00 48 c7 c7 d1 2a 4e 81 e8 bf fb fd ff 48 8b 5d f0 4c 8b 65 f8 c9 c3 0f 1f 44 00 00 48 8b 87 08 02 00 00 55 48 89 e5 <48> 8b 40 f8 5d c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00
RIP  [<ffffffff8104801b>] kthread_data+0xb/0x20
 RSP <ffff8800072c1888>
CR2: fffffffffffffff8
---[ end trace 996a332dc399111d ]---
Fixing recursive fault but reboot is needed!

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:37 -07:00
Anton Vorontsov
a3f5f075c2 pstore/platform: Make automatic updates interval configurable
There is no behavioural change, the default value is still 60 seconds.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:37 -07:00
Anton Vorontsov
b8587daa75 pstore/ram_core: Remove now unused code
The code tried to maintain the global list of persistent ram zones,
which isn't a great idea overall, plus since Android's ram_console
is no longer there, we can remove some unused functions.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:37 -07:00
Anton Vorontsov
602b5be4f1 pstore/ram_core: Silence some printks
Since we use multiple regions, the messages are somewhat annoying.
We do print total mapped memory already, so no need to print the
information for each region in the library routines.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:28 -07:00
Anton Vorontsov
b5d38e9bf1 pstore/ram: Add console messages handling
The console log size is configurable via ramoops.console_size
module option, and the log itself is available via
<pstore-mount>/console-ramoops file.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:28 -07:00
Anton Vorontsov
755d66b48f pstore/ram: Factor ramoops_get_next_prz() out of ramoops_pstore_read()
This will help make code clearer when we'll add support for other
message types.

The patch also changes return value from -EINVAL to 0 in case of
end-of-records. The exact value doesn't matter for pstore (it should
be just <= 0), but 0 feels more correct.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:28 -07:00
Anton Vorontsov
f4c5d2423c pstore/ram: Factor dmesg przs initialization out of probe()
This will help make code clearer when we'll add support for other
message types.

This also makes probe() much shorter and understandable, plus
makes mem/record size checking a bit easier.

Implementation detail: we now use a paddr pointer, this will
be used for allocating persistent ram zones for other message
types.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:28 -07:00
Anton Vorontsov
cac2eb7b58 pstore/ram: Give proper names to dump-related variables
We're about to add support for other message types, so let's rename
some variables to not be confused later.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:28 -07:00
Anton Vorontsov
f29e5956ae pstore: Add console log messages support
Pstore doesn't support logging kernel messages in run-time, it only
dumps dmesg when kernel oopses/panics. This makes pstore useless for
debugging hangs caused by HW issues or improper use of HW (e.g.
weird device inserted -> driver tried to write a reserved bits ->
SoC hanged. In that case we don't get any messages in the pstore.

Therefore, let's add a runtime logging support: PSTORE_TYPE_CONSOLE.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Colin Cross <ccross@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:59:27 -07:00
Anton Vorontsov
364ed2f465 pstore/inode: Make pstore_fill_super() static
There's no reason to extern it. The patch fixes the annoying sparse
warning:

CHECK   fs/pstore/inode.c
fs/pstore/inode.c:264:5: warning: symbol 'pstore_fill_super' was not
declared. Should it be static?

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:52:40 -07:00
Anton Vorontsov
93cce04968 pstore/ram: Should zap persistent zone on unlink
Otherwise, unlinked file will reappear on the next boot.

Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:52:40 -07:00
Anton Vorontsov
fce3979304 pstore/ram_core: Factor persistent_ram_zap() out of post_init()
A handy function that we will use outside of ram_core soon. But
so far just factor it out and start using it in post_init().

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:52:40 -07:00
Anton Vorontsov
25b63da647 pstore/ram_core: Do not reset restored zone's position and size
Otherwise, the files will survive just one reboot, and on a subsequent
boot they will disappear.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:52:39 -07:00
Anton Vorontsov
201e4aca5a pstore/ram: Should update old dmesg buffer before reading
Without the update, we'll only see the new dmesg buffer after the
reboot, but previously we could see it right away. Making an oops
visible in pstore filesystem before reboot is a somewhat dubious
feature, but removing it wasn't an intentional change, so let's
restore it.

For this we have to make persistent_ram_save_old() safe for calling
multiple times, and also extern it.

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:52:39 -07:00
Arend van Spriel
a59d6293e5 debugfs: change parameter check in debugfs_remove() functions
The dentry parameter in debugfs_remove() and debugfs_remove_recursive()
is checked being a NULL pointer. To make cleanup by callers easier this
check is extended using the IS_ERR_OR_NULL macro instead because the
debugfs_create_... functions can return a ERR_PTR() value.

Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-06-13 16:40:41 -07:00
Eric Dumazet
047fe36052 splice: fix racy pipe->buffers uses
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered
by splice_shrink_spd() called from vmsplice_to_pipe()

commit 35f3d14dbb (pipe: add support for shrinking and growing pipes)
added capability to adjust pipe->buffers.

Problem is some paths don't hold pipe mutex and assume pipe->buffers
doesn't change for their duration.

Fix this by adding nr_pages_max field in struct splice_pipe_desc, and
use it in place of pipe->buffers where appropriate.

splice_shrink_spd() loses its struct pipe_inode_info argument.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Tom Herbert <therbert@google.com>
Cc: stable <stable@vger.kernel.org> # 2.6.35
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2012-06-13 21:16:42 +02:00
Bob Peterson
0d515210b6 GFS2: Add kobject release method
This patch adds a kobject release function that properly maintains
the kobject use count, so that accesses to the sysfs files do not
cause an access to freed kernel memory after an unmount.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2012-06-13 15:59:48 +01:00
Suresh Jayaraman
e73f843a32 cifs: fix parsing of password mount option
The double delimiter check that allows a comma in the password parsing code is
unconditional. We set "tmp_end" to the end of the string and we continue to
check for double delimiter. In the case where the password doesn't contain a
comma we end up setting tmp_end to NULL and eventually setting "options" to
"end". This results in the premature termination of the options string and hence
the values of UNCip and UNC are being set to NULL. This results in mount failure
with "Connecting to DFS root not implemented yet" error.

This error is usually not noticable as we have password as the last option in
the superblock mountdata. But when we call expand_dfs_referral() from
cifs_mount() and try to compose mount options for the submount, the resulting
mountdata will be of the form

   ",ver=1,user=foo,pass=bar,ip=x.x.x.x,unc=\\server\share"

and hence results in the above error. This bug has been seen with older NAS
servers running Samba 3.0.24.

Fix this by moving the double delimiter check inside the conditional loop.

Changes since -v1

   - removed the wrong strlen() micro optimization.

Signed-off-by: Suresh Jayaraman <sjayaraman@suse.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Cc: stable@vger.kernel.org [3.1+]
Signed-off-by: Steve French <sfrench@us.ibm.com>
2012-06-12 12:53:02 -05:00
Linus Torvalds
266ae4e615 Merge tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux
Pull writeback locking fix from Wu Fengguang:
 "fix unbalanced wb->list_lock in 3.5-rc1"

* tag 'writeback-lock-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
  writeback: Fix lock imbalance in writeback_sb_inodes()
2012-06-12 18:28:58 +03:00
Dan Carpenter
e216c8c771 NFS: add an endian notation for sparse
This is supposed to be a __be32 value.  Sparse complains a lot:

fs/nfs/callback_xdr.c:699:30: warning: incorrect type in initializer (different base types)
fs/nfs/callback_xdr.c:699:30:    expected unsigned int [unsigned] status
fs/nfs/callback_xdr.c:699:30:    got restricted __be32 const [usertype] csr_status
fs/nfs/callback_xdr.c:715:9: warning: cast to restricted __be32
fs/nfs/callback_xdr.c:716:16: warning: incorrect type in return expression (different base types)
fs/nfs/callback_xdr.c:716:16:    expected restricted __be32
fs/nfs/callback_xdr.c:716:16:    got unsigned int [unsigned] status

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-12 09:54:40 -04:00
Dan Carpenter
0439f31c35 NFSv4.1: integer overflow in decode_cb_sequence_args()
This seems like it could overflow on 32 bits.  Use kmalloc_array() which
has overflow protection built in.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-12 09:54:36 -04:00
Randy Dunlap
b84297197c exofs: fix sparse non-ANSI function warning
Fix sparse non-ANSI function warning:

  fs/exofs/sys.c:112:28: warning: non-ANSI function declaration of function 'exofs_sysfs_dbg_print'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-06-12 06:33:22 +03:00
Andy Adamson
2669940db8 NFSv4 do not send an empty SETATTR compound
Commit 536e43d12b ATTR_OPEN check can result in
an ia_valid with only ATTR_FILE set, and no NFS_VALID_ATTRS attributes to
request from the server.

Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-11 17:25:53 -04:00
Sachin Prabhu
64f9a83665 NFSv2: EOF incorrectly set on short read
In cases where the server returns fewer bytes then those requested, we
can incorrectly set the eof flag for the file. Fixing this allows the
request to be retried with updated offset and count arguments.

Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-11 17:25:00 -04:00
Steven Whitehouse
0fe2f1e929 GFS2: Size seq_file buffer more carefully
This places a limit on the buffer size for archs with larger
PAGE_SIZE.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
2012-06-11 13:49:47 +01:00
Steven Whitehouse
1bb49303b7 GFS2: Use seq_vprintf for glocks debugfs file
Make use of the newly added seq_vprintf() function.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
2012-06-11 13:26:50 +01:00
Steven Whitehouse
a4808147dc seq_file: Add seq_vprintf function and export it
The existing seq_printf function is rewritten in terms of the new
seq_vprintf which is also exported to modules. This allows GFS2
(and potentially other seq_file users) to have a vprintf based
interface and to avoid an extra copy into a temporary buffer in
some cases.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
2012-06-11 13:16:35 +01:00
Bryan Schumaker
c5afc8da5b NFS: Use the NFS_DEFAULT_VERSION for v2 and v3 mounts
Older versions of nfs utils don't always pass a "vers=" mount option for
NFS.  This chould lead to attempts at using NFS v0 due to a zeroed out
nfs_parsed_mount_data struct.  I solve this by setting the default NFS
version to NFS_DEFAULT_VERSION in the v2 and v3 cases (v4 has already been
taken care of by a similar patch).

Reported-by: Joerg Roedel <joro@&bytes.org>
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-09 14:38:59 -04:00
Fred Isaman
906369e43c NFS: fix directio refcount bug on commit
This reverts a hunk from commit 0427708657
"NFS: Clean up - Simplify reference counting in fs/nfs/direct.c"

The cleanups in that patch affect the write path, but by the time
processing hits commit the removed reference has been added back by
nfs_scan_commit_list().  Without this reversion, any page that is
sent to commit holds on to an unbalanced reference that is never
freed.  The immediate effect is an imbalance over the wire between
OPENs and CLOSEs.

Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2012-06-09 14:32:45 -04:00
Wanpeng Li
331cbdeede writeback: Fix some comment errors
Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-06-09 19:54:47 +08:00
Jan Kara
ead188f9f9 writeback: Fix lock imbalance in writeback_sb_inodes()
Fix bug introduced by 169ebd90.  We have to have wb_list_lock locked when
restarting writeback loop after having waited for inode writeback.

Bug description by Ted Tso:

  I can reproduce this fairly easily by using ext4 w/o a journal, running
  under KVM with 1024megs memory, with fsstress (xfstests #13):

  [   45.153294] =====================================
  [   45.154784] [ BUG: bad unlock balance detected! ]
  [   45.155591] 3.5.0-rc1-00002-gb22b1f1 #124 Not tainted
  [   45.155591] -------------------------------------
  [   45.155591] flush-254:16/2499 is trying to release lock (&(&wb->list_lock)->rlock) at:
  [   45.155591] [<c022c3da>] writeback_sb_inodes+0x160/0x327
  [   45.155591] but there are no more locks to release!

Reported-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2012-06-09 08:32:15 +09:00