android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Felix Blyakher	ec91d1335f	xfs: fix double unlock in xfs_swap_extents() Regreesion from commit `ef8f7fc`, which rearranged the code in xfs_swap_extents() leading to double unlock of xfs inode ilock. That resulted in xfs_fsr deadlocking itself on platforms, which don't handle double unlock of rw_semaphore nicely. It caused the count go negative, which represents the write holder, without really having one. ia64 is one of the platforms where deadlock was easily reproduced and the fix was tested. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Felix Blyakher <felixb@sgi.com>	2009-05-08 00:29:44 -05:00
Linus Torvalds	d7a5926978	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: (32 commits) [CIFS] Fix double list addition in cifs posix open code [CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp [CIFS] Fix SMB uid in NTLMSSP authenticate request [CIFS] NTLMSSP reenabled after move from connect.c to sess.c [CIFS] Remove sparse warning [CIFS] remove checkpatch warning [CIFS] Fix final user of old string conversion code [CIFS] remove cifs_strfromUCS_le [CIFS] NTLMSSP support moving into new file, old dead code removed [CIFS] Fix endian conversion of vcnum field [CIFS] Remove trailing whitespace [CIFS] Remove sparse endian warnings [CIFS] Add remaining ntlmssp flags and standardize field names [CIFS] Fix build warning cifs: fix length handling in cifs_get_name_from_search_buf [CIFS] Remove unneeded QuerySymlink call and fix mapping for unmapped status [CIFS] rename cifs_strndup to cifs_strndup_from_ucs Added loop check when mounting DFS tree. Enable dfs submounts to handle remote referrals. [CIFS] Remove older session setup implementation ...	2009-05-07 21:13:24 -07:00
Steve French	90e4ee5d31	[CIFS] Fix double list addition in cifs posix open code Remove adding open file entry twice to lists in the file Do not fill file info twice in case of posix opens and creates Signed-off-by: Shirish Pargaonkar <shirishp@us.ibm.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-08 03:04:30 +00:00
David Teigland	8511a2728a	dlm: fix use count with multiple joins When a lockspace was joined multiple times, the global dlm use count was incremented when it should not have been. This caused the global dlm threads to not be stopped when all lockspaces were eventually be removed. Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-07 10:14:42 -05:00
Geert Uytterhoeven	08ce4c91e4	dlm: Make name input parameter of {,dlm_}new_lockspace() const \| fs/gfs2/lock_dlm.c:207: warning: passing argument 1 of 'dlm_new_lockspace' discards qualifiers from pointer target type Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David Teigland <teigland@redhat.com>	2009-05-07 10:14:26 -05:00
Josef Bacik	df3935ffd6	fiemap: fix problem with setting FIEMAP_EXTENT_LAST Fix a problem where the generic block based fiemap stuff would not properly set FIEMAP_EXTENT_LAST on the last extent. I've reworked things to keep track if we go past the EOF, and mark the last extent properly. The problem was reported by and tested by Eric Sandeen. Tested-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Josef Bacik <jbacik@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: <xfs-masters@oss.sgi.com> Cc: <linux-btrfs@vger.kernel.org> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <Joel.Becker@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-06 16:36:09 -07:00
Wu Fengguang	381a80e6df	inotify: use GFP_NOFS in kernel_event() to work around a lockdep false-positive There is what we believe to be a false positive reported by lockdep. inotify_inode_queue_event() => take inotify_mutex => kernel_event() => kmalloc() => SLOB => alloc_pages_node() => page reclaim => slab reclaim => dcache reclaim => inotify_inode_is_dead => take inotify_mutex => deadlock The plan is to fix this via lockdep annotation, but that is proving to be quite involved. The patch flips the allocation over to GFP_NFS to shut the warning up, for the 2.6.30 release. Hopefully we will fix this for real in 2.6.31. I'll queue a patch in -mm to switch it back to GFP_KERNEL so we don't forget. ================================= [ INFO: inconsistent lock state ] 2.6.30-rc2-next-20090417 #203 --------------------------------- inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage. kswapd0/380 [HC0[0]:SC0[0]:HE1:SE1] takes: (&inode->inotify_mutex){+.+.?.}, at: [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0 {RECLAIM_FS-ON-W} state was registered at: [<ffffffff81079188>] mark_held_locks+0x68/0x90 [<ffffffff810792a5>] lockdep_trace_alloc+0xf5/0x100 [<ffffffff810f5261>] __kmalloc_node+0x31/0x1e0 [<ffffffff81130652>] kernel_event+0xe2/0x190 [<ffffffff81130826>] inotify_dev_queue_event+0x126/0x230 [<ffffffff8112f096>] inotify_inode_queue_event+0xc6/0x110 [<ffffffff8110444d>] vfs_create+0xcd/0x140 [<ffffffff8110825d>] do_filp_open+0x88d/0xa20 [<ffffffff810f6b68>] do_sys_open+0x98/0x140 [<ffffffff810f6c50>] sys_open+0x20/0x30 [<ffffffff8100c272>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff irq event stamp: 690455 hardirqs last enabled at (690455): [<ffffffff81564fe4>] _spin_unlock_irqrestore+0x44/0x80 hardirqs last disabled at (690454): [<ffffffff81565372>] _spin_lock_irqsave+0x32/0xa0 softirqs last enabled at (690178): [<ffffffff81052282>] __do_softirq+0x202/0x220 softirqs last disabled at (690157): [<ffffffff8100d50c>] call_softirq+0x1c/0x50 other info that might help us debug this: 2 locks held by kswapd0/380: #0: (shrinker_rwsem){++++..}, at: [<ffffffff810d0bd7>] shrink_slab+0x37/0x180 #1: (&type->s_umount_key#17){++++..}, at: [<ffffffff8110cfbf>] shrink_dcache_memory+0x11f/0x1e0 stack backtrace: Pid: 380, comm: kswapd0 Not tainted 2.6.30-rc2-next-20090417 #203 Call Trace: [<ffffffff810789ef>] print_usage_bug+0x19f/0x200 [<ffffffff81018bff>] ? save_stack_trace+0x2f/0x50 [<ffffffff81078f0b>] mark_lock+0x4bb/0x6d0 [<ffffffff810799e0>] ? check_usage_forwards+0x0/0xc0 [<ffffffff8107b142>] __lock_acquire+0xc62/0x1ae0 [<ffffffff810f478c>] ? slob_free+0x10c/0x370 [<ffffffff8107c0a1>] lock_acquire+0xe1/0x120 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff81562d43>] mutex_lock_nested+0x63/0x420 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff8112f1b5>] ? inotify_inode_is_dead+0x35/0xb0 [<ffffffff81012fe9>] ? sched_clock+0x9/0x10 [<ffffffff81077165>] ? lock_release_holdtime+0x35/0x1c0 [<ffffffff8112f1b5>] inotify_inode_is_dead+0x35/0xb0 [<ffffffff8110c9dc>] dentry_iput+0xbc/0xe0 [<ffffffff8110cb23>] d_kill+0x33/0x60 [<ffffffff8110ce23>] __shrink_dcache_sb+0x2d3/0x350 [<ffffffff8110cffa>] shrink_dcache_memory+0x15a/0x1e0 [<ffffffff810d0cc5>] shrink_slab+0x125/0x180 [<ffffffff810d1540>] kswapd+0x560/0x7a0 [<ffffffff810ce160>] ? isolate_pages_global+0x0/0x2c0 [<ffffffff81065a30>] ? autoremove_wake_function+0x0/0x40 [<ffffffff8107953d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff810d0fe0>] ? kswapd+0x0/0x7a0 [<ffffffff8106555b>] kthread+0x5b/0xa0 [<ffffffff8100d40a>] child_rip+0xa/0x20 [<ffffffff8100cdd0>] ? restore_args+0x0/0x30 [<ffffffff81065500>] ? kthread+0x0/0xa0 [<ffffffff8100d400>] ? child_rip+0x0/0x20 [eparis@redhat.com: fix audit too] Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matt Mackall <mpm@selenic.com> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Eric Paris <eparis@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-06 16:36:09 -07:00
J. Bruce Fields	89996df4b5	lockd: fix list corruption on lockd restart If lockd is signalled soon enough after restart then locks_start_grace() will try to re-add an entry to a list and trigger a lock corruption warning. Thanks to Wang Chen for the problem report and diagnosis. WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c() ... list_add corruption. next->prev should be prev (ef8fe958), but was ef8ff128. (next=ef8ff128). ... Pid: 23062, comm: lockd Tainted: G W 2.6.30-rc2 #3 Call Trace: [<c042d5b5>] warn_slowpath+0x71/0xa0 [<c0422a96>] ? update_curr+0x11d/0x125 [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c044b270>] ? trace_hardirqs_on+0xb/0xd [<c051c61a>] ? _raw_spin_lock+0x53/0xfa [<c051c89f>] __list_add+0x27/0x5c [<ef8f6daa>] locks_start_grace+0x22/0x30 [lockd] [<ef8f34da>] set_grace_period+0x39/0x53 [lockd] [<c06b8921>] ? lock_kernel+0x1c/0x28 [<ef8f3558>] lockd+0x64/0x164 [lockd] [<c044b12d>] ? trace_hardirqs_on_caller+0x18/0x150 [<c04227b0>] ? complete+0x34/0x3e [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<ef8f34f4>] ? lockd+0x0/0x164 [lockd] [<c043dd42>] kthread+0x45/0x6b [<c043dcfd>] ? kthread+0x0/0x6b [<c0403c23>] kernel_thread_helper+0x7/0x10 Reported-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Cc: stable@kernel.org	2009-05-06 17:19:36 -04:00
Wang Chen	02cb2858db	nfsd: nfs4_stat_init cleanup Save some loop time. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-06 16:22:41 -04:00
J. Bruce Fields	b2c0cea6b1	nfsd4: check for negative dentry before use in nfsv4 readdir After `2f9092e102` "Fix i_mutex vs. readdir handling in nfsd" (and `14f7dd63` "Copy XFS readdir hack into nfsd code"), an entry may be removed between the first mutex_unlock and the second mutex_lock. In this case, lookup_one_len() will return a negative dentry. Check for this case to avoid a NULL dereference. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Reviewed-by: J. R. Okajima <hooanon05@yahoo.co.jp> Cc: stable@kernel.org	2009-05-06 16:16:36 -04:00
Adrian Hunter	6d6cb0d688	UBIFS: reset no_space flag after inode deletion When UBIFS runs out of space it spends a lot of time trying to find more space before returning ENOSPC. As there is no point repeating that unless something has changed, UBIFS has an optimization to record that the file system is 100% full and not try to find space. That flag was not being reset when a pending deletion was finally done. Signed-off-by: Adrian Hunter <adrian.hunter@nokia.com> Reviewed-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2009-05-06 10:37:56 +03:00
Steve French	ac68392460	[CIFS] Allow raw ntlmssp code to be enabled with sec=ntlmssp On mount, "sec=ntlmssp" can now be specified to allow "rawntlmssp" security to be enabled during CIFS session establishment/authentication (ntlmssp used to require specifying krb5 which was counterintuitive). Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-06 04:16:04 +00:00
Steve French	844823cb82	[CIFS] Fix SMB uid in NTLMSSP authenticate request We were not setting the SMB uid in NTLMSSP authenticate request which could lead to INVALID_PARAMETER error on 2nd session setup. Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-06 00:48:30 +00:00
Coly Li	2b53bc7bff	ocfs2: update comments in masklog.h In the mainline ocfs2 code, the interface for masklog is in files under /sys/fs/o2cb/masklog, but the comments in fs/ocfs2/cluster/masklog.h reference the old /proc interface. They are out of date. This patch modifies the comments in cluster/masklog.h, which also provides a bash script example on how to change the log mask bits. Signed-off-by: Coly Li <coly.li@suse.de> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-05-05 14:48:11 -07:00
Tao Ma	a46fa684fc	ocfs2: Don't printk the error when listing too many xattrs. Currently the kernel defines XATTR_LIST_MAX as 65536 in include/linux/limits.h. This is the largest buffer that is used for listing xattrs. But with ocfs2 xattr tree, we actually have no limit for the number. If filesystem has more names than can fit in the buffer, the kernel logs will be pollluted with something like this when listing: (27738,0):ocfs2_iterate_xattr_buckets:3158 ERROR: status = -34 (27738,0):ocfs2_xattr_tree_list_index_block:3264 ERROR: status = -34 So don't print "ERROR" message as this is not an ocfs2 error. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-05-05 14:43:24 -07:00
Jake Edge	f83ce3e6b0	proc: avoid information leaks to non-privileged processes By using the same test as is used for /proc/pid/maps and /proc/pid/smaps, only allow processes that can ptrace() a given process to see information that might be used to bypass address space layout randomization (ASLR). These include eip, esp, wchan, and start_stack in /proc/pid/stat as well as the non-symbolic output from /proc/pid/wchan. ASLR can be bypassed by sampling eip as shown by the proof-of-concept code at http://code.google.com/p/fuzzyaslr/ As part of a presentation (http://www.cr0.org/paper/to-jt-linux-alsr-leak.pdf) esp and wchan were also noted as possibly usable information leaks as well. The start_stack address also leaks potentially useful information. Cc: Stable Team <stable@kernel.org> Signed-off-by: Jake Edge <jake@lwn.net> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-04 15:14:23 -07:00
Steve French	0b3cc85800	[CIFS] NTLMSSP reenabled after move from connect.c to sess.c The NTLMSSP code was removed from fs/cifs/connect.c and merged (75% smaller, cleaner) into fs/cifs/sess.c As with the old code it requires that cifs be built with CONFIG_CIFS_EXPERIMENTAL, the /proc/fs/cifs/Experimental flag must be set to 2, and mount must turn on extended security (e.g. with sec=krb5). Although NTLMSSP encapsulated in SPNEGO is not enabled yet, "raw" ntlmssp is common and useful in some cases since it offers more complete security negotiation, and is the default way of negotiating security for many Windows systems. SPNEGO encapsulated NTLMSSP will be able to reuse the same code. Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-04 08:37:12 +00:00
Randy Dunlap	9064caae8f	nfsd: use C99 struct initializers Eliminate 56 sparse warnings like this one: fs/nfsd/nfs4xdr.c:1331:15: warning: obsolete array initializer, use C99 syntax Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:09:12 -04:00
J. Bruce Fields	63e4863fab	nfsd4: make recall callback an asynchronous rpc As with the probe, this removes the need for another kthread. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 15:08:56 -04:00
Andy Adamson	ccecee1e5e	nfsd41: slots are freed with session The session and slots are allocated all in one piece. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-03 14:45:02 -04:00
Trond Myklebust	7fdf523067	NFS: Close page_mkwrite() races Follow up to Nick Piggin's patches to ensure that nfs_vm_page_mkwrite returns with the page lock held, and sets the VM_FAULT_LOCKED flag. See http://bugzilla.kernel.org/show_bug.cgi?id=12913 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 19:42:39 -07:00
Aneesh Kumar K.V	955ce5f5be	ext4: Convert ext4_lock_group to use sb_bgl_lock We have sb_bgl_lock() and ext4_group_info.bb_state bit spinlock to protech group information. The later is only used within mballoc code. Consolidate them to use sb_bgl_lock(). This makes the mballoc.c code much simpler and also avoid confusion with two locks protecting same info. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-02 20:35:09 -04:00
Linus Torvalds	b4348f32da	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix getbmap vs mmap deadlock xfs: a couple getbmap cleanups xfs: add more checks to superblock validation xfs_file_last_byte() needs to acquire ilock	2009-05-02 16:52:50 -07:00
Linus Torvalds	afc1e702e8	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/configfs: configfs: Fix Trivial Warning in fs/configfs/symlink.c	2009-05-02 16:50:46 -07:00
Linus Torvalds	7e567b44e6	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Change repository in MAINTAINERS. ocfs2: Fix a missing credit when deleting from indexed directories. ocfs2/trivial: Remove unused variable in ocfs2_rename. ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock() ocfs2: Fix some printk() warnings. ocfs2: Fix 2 warning during ocfs2 make. ocfs2: Reserve 1 more cluster in expanding_inline_dir for indexed dir.	2009-05-02 16:30:47 -07:00
Theodore Ts'o	eefd7f03b8	ext4: fix the length returned by fiemap for an unallocated extent If the file's blocks have not yet been allocated because of delayed allocation, the length of the extent returned by fiemap is incorrect. This commit fixes this bug. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-02 19:05:37 -04:00
Oleg Nesterov	0ae05fb254	ptrace: s/parent/real_parent/ in binfmt_elf_fdpic.c ->real_parent is the parent. ->parent may be the tracer. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00
KOSAKI Motohiro	00a62ce91e	mm: fix Committed_AS underflow on large NR_CPUS environment The Committed_AS field can underflow in certain situations: > # while true; do cat /proc/meminfo \| grep _AS; sleep 1; done \| uniq -c > 1 Committed_AS: 18446744073709323392 kB > 11 Committed_AS: 18446744073709455488 kB > 6 Committed_AS: 35136 kB > 5 Committed_AS: 18446744073709454400 kB > 7 Committed_AS: 35904 kB > 3 Committed_AS: 18446744073709453248 kB > 2 Committed_AS: 34752 kB > 9 Committed_AS: 18446744073709453248 kB > 8 Committed_AS: 34752 kB > 3 Committed_AS: 18446744073709320960 kB > 7 Committed_AS: 18446744073709454080 kB > 3 Committed_AS: 18446744073709320960 kB > 5 Committed_AS: 18446744073709454080 kB > 6 Committed_AS: 18446744073709320960 kB Because NR_CPUS can be greater than 1000 and meminfo_proc_show() does not check for underflow. But NR_CPUS proportional isn't good calculation. In general, possibility of lock contention is proportional to the number of online cpus, not theorical maximum cpus (NR_CPUS). The current kernel has generic percpu-counter stuff. using it is right way. it makes code simplify and percpu_counter_read_positive() don't make underflow issue. Reported-by: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> [All kernel versions] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00
Ivan Kokshaysky	74641f584d	alpha: binfmt_aout fix This fixes the problem introduced by commit `3bfacef412` (get rid of special-casing the /sbin/loader on alpha): osf/1 ecoff binary segfaults when binfmt_aout built as module. That happens because aout binary handler gets on the top of the binfmt list due to late registration, and kernel attempts to execute the binary without preparatory work that must be done by binfmt_loader. Fixed by changing the registration order of the default binfmt handlers using list_add_tail() and introducing insert_binfmt() function which places new handler on the top of the binfmt list. This might be generally useful for installing arch-specific frontends for default handlers or just for overriding them. Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Richard Henderson <rth@twiddle.net Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:10 -07:00
Vitaly Mayatskikh	0816178638	pagemap: require aligned-length, non-null reads of /proc/pid/pagemap The intention of commit `aae8679b0e` ("pagemap: fix bug in add_to_pagemap, require aligned-length reads of /proc/pid/pagemap") was to force reads of /proc/pid/pagemap to be a multiple of 8 bytes, but now it allows to read 0 bytes, which actually puts some data to user's buffer. According to POSIX, if count is zero, read() should return zero and has no other results. Signed-off-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Cc: Thomas Tuttle <ttuttle@google.com> Acked-by: Matt Mackall <mpm@selenic.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:09 -07:00
Nick Piggin	b827e496c8	mm: close page_mkwrite races Change page_mkwrite to allow implementations to return with the page locked, and also change it's callers (in page fault paths) to hold the lock until the page is marked dirty. This allows the filesystem to have full control of page dirtying events coming from the VM. Rather than simply hold the page locked over the page_mkwrite call, we call page_mkwrite with the page unlocked and allow callers to return with it locked, so filesystems can avoid LOR conditions with page lock. The problem with the current scheme is this: a filesystem that wants to associate some metadata with a page as long as the page is dirty, will perform this manipulation in its ->page_mkwrite. It currently then must return with the page unlocked and may not hold any other locks (according to existing page_mkwrite convention). In this window, the VM could write out the page, clearing page-dirty. The filesystem has no good way to detect that a dirty pte is about to be attached, so it will happily write out the page, at which point, the filesystem may manipulate the metadata to reflect that the page is no longer dirty. It is not always possible to perform the required metadata manipulation in ->set_page_dirty, because that function cannot block or fail. The filesystem may need to allocate some data structure, for example. And the VM cannot mark the pte dirty before page_mkwrite, because page_mkwrite is allowed to fail, so we must not allow any window where the page could be written to if page_mkwrite does fail. This solution of holding the page locked over the 3 critical operations (page_mkwrite, setting the pte dirty, and finally setting the page dirty) closes out races nicely, preventing page cleaning for writeout being initiated in that window. This provides the filesystem with a strong synchronisation against the VM here. - Sage needs this race closed for ceph filesystem. - Trond for NFS (http://bugzilla.kernel.org/show_bug.cgi?id=12913). - I need it for fsblock. - I suspect other filesystems may need it too (eg. btrfs). - I have converted buffer.c to the new locking. Even simple block allocation under dirty pages might be susceptible to i_size changing under partial page at the end of file (we also have a buffer.c-side problem here, but it cannot be fixed properly without this patch). - Other filesystems (eg. NFS, maybe btrfs) will need to change their page_mkwrite functions themselves. [ This also moves page_mkwrite another step closer to fault, which should eventually allow page_mkwrite to be moved into ->fault, and thus avoiding a filesystem calldown and page lock/unlock cycle in __do_fault. ] [akpm@linux-foundation.org: fix derefs of NULL ->mapping] Cc: Sage Weil <sage@newdream.net> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:09 -07:00
Ian Kent	a8985f3ac5	autofs4: fix incorrect return in autofs4_mount_busy() Fix an obvious incorrect return status in autofs4_mount_busy(). Signed-off-by: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-05-02 15:36:09 -07:00
Steve French	24d35add2b	[CIFS] Remove sparse warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-02 05:40:39 +00:00
Steve French	989c7e512f	[CIFS] remove checkpatch warning Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-02 05:32:20 +00:00
Steve French	afe48c31ea	[CIFS] Fix final user of old string conversion code Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-02 05:25:46 +00:00
Jeff Layton	3410602732	[CIFS] remove cifs_strfromUCS_le Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-02 04:59:34 +00:00
Steve French	2edd6c5b05	[CIFS] NTLMSSP support moving into new file, old dead code removed Remove dead NTLMSSP support from connect.c prior to addition of the new code to replace it. Signed-off-by: Steve French <sfrench@us.ibm.com>	2009-05-02 04:55:39 +00:00
Eric Sandeen	c9877b205f	ext4: fix for fiemap last-block test Carl Henrik Lunde reported and debugged this; the test for the last allocated block was comparing bytes to blocks in this test: if (logical + length - 1 == EXT_MAX_BLOCK \|\| ext4_ext_next_allocated_block(path) == EXT_MAX_BLOCK) flags \|= FIEMAP_EXTENT_LAST; so any extent which ended right at 4G was stopping the extent walk. Just replacing these values with the extent block & length should fix it. Also give blksize_bits a saner type, and reverse the order of the tests to make the more likely case tested first. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no> Tested-by: Carl Henrik Lunde <chlunde@ping.uio.no> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-01 23:32:06 -04:00
Aneesh Kumar K.V	19ba0559f9	vfs: Enable FS_IOC_FIEMAP and FIGETBSZ for all filetypes The fiemap and get_blk_size ioctls should be enabled even for directories. So move it outisde file_ioctl. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-13 18:12:05 -04:00
Aneesh Kumar K.V	abc8746eb9	ext4: hook fiemap operation for directories Add fiemap callback for directories Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-02 22:54:32 -04:00
Curt Wohlgemuth	f40339031b	ext4: Make the length of the mb_history file tunable In memory-constrained systems with many partitions, the ~68K for each partition for the mb_history buffer can be excessive. This patch adds a new mount option, mb_history_length, as well as a way of setting the default via a module parameter (or via a sysfs parameter in /sys/module/ext4/parameter/default_mb_history_length). If the mb_history_length is set to zero, the mb_history facility is disabled entirely. Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-01 20:27:20 -04:00
J. Bruce Fields	3aea09dc91	nfsd4: track recall retries in nfs4_delegation Move this out of a local variable into the nfs4_delegation object in preparation for making this an async rpc call (at which point we'll need any state like this in a common object that's preserved across function calls). Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 20:11:12 -04:00
J. Bruce Fields	6707bd3d42	nfsd4: remove unused dl_trunc There's no point in keeping this field around--it's always zero. (Background: the protocol allows you to tell the client that the file is about to be truncated, as an optimization to save the client from writing back dirty pages that will just be discarded. We don't implement this hint. If we do some day, adding this field back in will be the least of the work involved.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:57:46 -04:00
J. Bruce Fields	b53d40c507	nfsd4: eliminate struct nfs4_cb_recall The nfs4_cb_recall struct is used only in nfs4_delegation, so its pointer to the containing delegation is unnecessary--we could just use container_of(). But there's no real reason to have this a separate struct at all--just move these fields to nfs4_delegation. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 19:50:00 -04:00
Theodore Ts'o	bb23c20a85	ext4: Move fs/ext4/group.h into ext4.h Move the function prototypes in group.h into ext4.h so they are all defined in one place. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-01 19:44:44 -04:00
J. Bruce Fields	c237dc0303	nfsd4: rename callback struct to cb_conn I want to use the name for a struct that actually does represent a single callback. (Actually, I've never been sure it helps to a separate struct for the callback information. Some day maybe those fields could just be dumped into struct nfs4_client. I don't know.) Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-05-01 17:31:44 -04:00
Theodore Ts'o	596397b77c	ext4: Move fs/ext4/namei.h into ext4.h The fs/ext4/namei.h header file had only a single function declaration, and should have never been a standalone file. Move it into ext4.h, where should have been from the beginning. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-01 13:49:15 -04:00
Theodore Ts'o	ca0faba0e8	ext4: Move the ext4_sb.h header file into ext4.h There is no longer a reason for a separate ext4_sb.h header file, so move it into ext4.h just to make life easier for developers to find the relevant data structures and typedefs. Should also speed up compiles slightly, too. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-03 16:33:44 -04:00
Theodore Ts'o	d444c3c381	ext4: Move the ext4_i.h header file into ext4.h There is no longer a reason for a separate ext4_i.h header file, so move it into ext4.h just to make life easier for developers to find the relevant data structures and typedefs. Should also speed up compiles slightly, too. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-05-01 13:44:33 -04:00
Theodore Ts'o	75507efb13	ext4: Don't avoid using BLOCK_UNINIT block groups in mballoc By avoiding the use of not-yet-used block groups (i.e., block groups with the BLOCK_UNINIT flag), mballoc had a tendency to create large files with large non-contiguous gaps. In addition avoiding the use of new block groups had a tendency to push regular file data into the first block group in a flex_bg group, which slows down the speed of e2fsck pass 2, since it has a tendency to seek much more. For example: Before Patch After Patch Time in seconds Time in seconds Real / User/ Sys MB/s Real / User/ Sys MB/s Pass 1 8.52 / 2.21 / 0.46 20.43 8.84 / 4.97 / 1.11 19.68 Pass 2 21.16 / 1.02 / 1.86 11.30 6.54 / 1.77 / 1.78 36.39 Pass 3 0.01 / 0.00 / 0.00 139.00 0.01 / 0.01 / 0.00 128.90 Pass 4 0.16 / 0.15 / 0.00 0.00 0.17 / 0.17 / 0.00 0.00 Pass 5 2.52 / 1.99 / 0.09 0.79 2.31 / 1.78 / 0.06 0.86 Total 32.40 / 5.11 / 2.49 12.81 17.99 / 8.75 / 2.98 23.01 This was on a sample 80 gig root filesystem which was approximately 50% full. Note the improved e2fsck pass 2 performance, by over a factor of 3, due to a decreased number of seeks. (The total amount of I/O in pass 2 was unchanged; the layout of the directory blocks was simply much better from e2fsck's's perspective.) Other changes as a result of this patch on this sample filesystem: Before Patch After Patch # of non-contig files 762 779 # of non-contig directories 571 570 # of BLOCK_UNINIT bg's 307 293 # of INODE_UNINIT bg's 503 503 Out of 640 block groups, of which 333 were in use, this patch caused an extra 14 block groups to be utilized. The number of non-contiguous files did go up slightly, but when measured against the 99.9% of the files (603,154) which were contiguously allocated, this is pretty insignificant. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andreas Dilger <adilger@sun.com>	2009-05-01 12:58:36 -04:00

... 36 37 38 39 40 ...

15761 Commits