android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Theodore Ts'o	021b65bb1e	ext4: Rename ext4_free_blks_{count,set}() to refer to clusters The field bg_free_blocks_count_{lo,high} in the block group descriptor has been repurposed to hold the number of free clusters for bigalloc functions. So rename the functions so it makes it easier to read and audit the block allocation and block freeing code. Note: at this point in bigalloc development we doesn't support online resize, so this also makes it really obvious all of the places we need to fix up to add support for online resize. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 19:08:51 -04:00
Theodore Ts'o	6f16b60690	ext4: enable mounting bigalloc as read/write Now that we have implemented all of the changes needed for bigalloc, we can finally enable it! Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 19:06:51 -04:00
Aditya Kali	7b415bf60f	ext4: Fix bigalloc quota accounting and i_blocks value With bigalloc changes, the i_blocks value was not correctly set (it was still set to number of blocks being used, but in case of bigalloc, we want i_blocks to represent the number of clusters being used). Since the quota subsystem sets the i_blocks value, this patch fixes the quota accounting and makes sure that the i_blocks value is set correctly. Signed-off-by: Aditya Kali <adityakali@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 19:04:51 -04:00
Theodore Ts'o	27baebb849	ext4: tune mballoc's default group prealloc size for bigalloc file systems The default group preallocation size had been previously set to 512 blocks/clusters, regardless of the block/cluster size. This is probably too big for large cluster sizes. So adjust the default so that it is 2 megabytes or 32 clusters, whichever is larger. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 19:02:51 -04:00
Theodore Ts'o	f975d6bcc7	ext4: teach ext4_statfs() to deal with clusters if bigalloc is enabled Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 19:00:51 -04:00
Theodore Ts'o	24aaa8ef4e	ext4: convert the free_blocks field in s_flex_groups to be free_clusters Convert the free_blocks to be free_clusters to make the final revised bigalloc changes easier to read/understand. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:58:51 -04:00
Theodore Ts'o	5704265188	ext4: convert s_{dirty,free}blocks_counter to s_{dirty,free}clusters_counter Convert the percpu counters s_dirtyblocks_counter and s_freeblocks_counter in struct ext4_super_info to be s_dirtyclusters_counter and s_freeclusters_counter. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:56:51 -04:00
Theodore Ts'o	0aa060000e	ext4: teach ext4_ext_truncate() about the bigalloc feature When we are truncating (as opposed unlinking) a file, we need to worry about partial truncates of a file, especially in the light of sparse files. The changes here make sure that arbitrary truncates of sparse files works correctly. Yeah, it's messy. Note that these functions will need to be revisted when the punch ioctl is integrated --- in fact this commit will probably have merge conflicts with the punch changes which Allison Henders and the IBM LTC have been working on. I will need to fix this up when either patch hits mainline. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:54:51 -04:00
Theodore Ts'o	4d33b1ef10	ext4: teach ext4_ext_map_blocks() about the bigalloc feature If we need to allocate a new block in ext4_ext_map_blocks(), the function needs to see if the cluster has already been allocated. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:52:51 -04:00
Theodore Ts'o	84130193e0	ext4: teach ext4_free_blocks() about bigalloc and clusters The ext4_free_blocks() function now has two new flags that indicate whether a partial cluster at the beginning or the end of the block extents should be freed or not. That will be up the caller (i.e., truncate), who can figure out whether partial clusters at the beginning or the end of a block range can be freed. We also have to update the ext4_mb_free_metadata() and release_blocks_on_commit() machinery to be cluster-based, since it is used by ext4_free_blocks(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:50:51 -04:00
Theodore Ts'o	53accfa9f8	ext4: teach mballoc preallocation code about bigalloc clusters In most of mballoc.c, we do everything in units of clusters, since the block allocation bitmaps and buddy bitmaps are all denominated in clusters. The one place where we do deal with absolute block numbers is in the code that handles the preallocation regions, since in the case of inode-based preallocation regions, the start of the preallocation region can't be relative to the beginning of the group. So this adds a bit of complexity, where pa_pstart and pa_lstart are block numbers, while pa_free, pa_len, and fe_len are denominated in units of clusters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:48:51 -04:00
Linus Torvalds	0d20fbbe82	Merge branch 'for-linus' of git://ceph.newdream.net/git/ceph-client * 'for-linus' of git://ceph.newdream.net/git/ceph-client: libceph: fix leak of osd structs during shutdown ceph: fix memory leak ceph: fix encoding of ino only (not relative) paths libceph: fix msgpool	2011-09-09 15:48:34 -07:00
Theodore Ts'o	3212a80a58	ext4: convert block group-relative offsets to use clusters Certain parts of the ext4 code base, primarily in mballoc.c, use a block group number and offset from the beginning of the block group. This offset is invariably used to index into the allocation bitmap, so change the offset to be denominated in units of clusters. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:46:51 -04:00
Theodore Ts'o	d5b8f31007	ext4: bigalloc changes to block bitmap initialization functions Add bigalloc support to ext4_init_block_bitmap() and ext4_free_blocks_after_init(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:44:51 -04:00
Theodore Ts'o	fd034a84e1	ext4: split out ext4_free_blocks_after_init() The function ext4_free_blocks_after_init() used to be a #define of ext4_init_block_bitmap(). This actually made it difficult to understand how the function worked, and made it hard make changes to support clusters. So as an initial cleanup, I've separated out the functionality of initializing block bitmap from calculating the number of free blocks in the new block group. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:42:51 -04:00
Miklos Szeredi	0ec26fd069	vfs: automount should ignore LOOKUP_FOLLOW Prior to 2.6.38 automount would not trigger on either stat(2) or lstat(2) on the automount point. After 2.6.38, with the introduction of the ->d_automount() infrastructure, stat(2) and others would start triggering automount while lstat(2), etc. still would not. This is a regression and a userspace ABI change. Problem originally reported here: http://thread.gmane.org/gmane.linux.kernel.autofs/6098 It appears that there was an attempt at fixing various userspace tools to not trigger the automount. But since the stat system call is rather common it is impossible to "fix" all userspace. This patch reverts the original behavior, which is to not trigger on stat(2) and other symlink following syscalls. [ It's not really clear what the right behavior is. Apparently Solaris does the "automount on stat, leave alone on lstat". And some programs can get unhappy when "stat+open+fstat" ends up giving a different result from the fstat than from the initial stat. But the change in 2.6.38 resulted in problems for some people, so we're going back to old behavior. Maybe we can re-visit this discussion at some future date - Linus ] Reported-by: Leonardo Chiquitto <leonardo.lists@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Ian Kent <raven@themaw.net> Cc: David Howells <dhowells@redhat.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-09 15:42:34 -07:00
Theodore Ts'o	49f7f9af4b	ext4: factor out block group accounting into functions This makes it easier to understand how ext4_init_block_bitmap() works, and it will assist when we split out ext4_free_blocks_after_init() in the next commit. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:40:51 -04:00
Theodore Ts'o	7137d7a48e	ext4: convert instances of EXT4_BLOCKS_PER_GROUP to EXT4_CLUSTERS_PER_GROUP Change the places in fs/ext4/mballoc.c where EXT4_BLOCKS_PER_GROUP are used to indicate the number of bits in a block bitmap (which is really a cluster allocation bitmap in bigalloc file systems). There are still some places in the ext4 codebase where usage of EXT4_BLOCKS_PER_GROUP needs to be audited/fixed, in code paths that aren't used given the initial restricted assumptions for bigalloc. These will need to be fixed before we can relax those restrictions. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:38:51 -04:00
Theodore Ts'o	bab08ab964	ext4: enforce bigalloc restrictions (e.g., no online resizing, etc.) At least initially if the bigalloc feature is enabled, we will not support non-extent mapped inodes, online resizing, online defrag, or the FITRIM ioctl. This simplifies the initial implementation. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:36:51 -04:00
Theodore Ts'o	281b599597	ext4: read-only support for bigalloc file systems This adds supports for bigalloc file systems. It teaches the mount code just enough about bigalloc superblock fields that it will mount the file system without freaking out that the number of blocks per group is too big. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:34:51 -04:00
Theodore Ts'o	7c2e70879f	ext4: add ext4-specific kludge to avoid an oops after the disk disappears The del_gendisk() function uninitializes the disk-specific data structures, including the bdi structure, without telling anyone else. Once this happens, any attempt to call mark_buffer_dirty() (for example, by ext4_commit_super), will cause a kernel OOPS. Fix this for now until we can fix things in an architecturally correct way. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-09 18:28:51 -04:00
Michal Hocko	a25cac5198	proc: Consider NO_HZ when printing idle and iowait times show_stat handler of the /proc/stat file relies on kstat_cpu(cpu) statistics when priting information about idle and iowait times. This is OK if we are not using tickless kernel (CONFIG_NO_HZ) because counters are updated periodically. With NO_HZ things got more tricky because we are not doing idle/iowait accounting while we are tickless so the value might get outdated. Users of /proc/stat will notice that by unchanged idle/iowait values which is then interpreted as 0% idle/iowait time. From the user space POV this is an unexpected behavior and a change of the interface. Let's fix this by using get_cpu_{idle,iowait}_time_us which accounts the total idle/iowait time since boot and it doesn't rely on sampling or any other periodic activity. Fall back to the previous behavior if NO_HZ is disabled or not configured. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Dave Jones <davej@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Link: http://lkml.kernel.org/r/39181366adac1b39cb6aa3cd53ff0f7c78d32676.1314172057.git.mhocko@suse.cz Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-09-08 11:10:55 +02:00
Linus Torvalds	54d6d53744	Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6 and git://git.infradead.org/ubi-2.6 * branch 'linux-next' of git://git.infradead.org/ubifs-2.6: UBIFS: not build debug messages with CONFIG_UBIFS_FS_DEBUG disabled * branch 'linux-next' of git://git.infradead.org/ubi-2.6: UBI: do not link debug messages when debugging is disabled	2011-09-07 09:51:43 -07:00
J. Bruce Fields	4665e2bac5	nfsd4: split out some free_generic_stateid code We'll use this elsewhere. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-07 09:47:23 -04:00
J. Bruce Fields	fe0750e5c4	nfsd4: split stateowners into open and lockowners The stateowner has some fields that only make sense for openowners, and some that only make sense for lockowners, and I find it a lot clearer if those are separated out. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-07 09:45:49 -04:00
Allison Henderson	02fac1297e	ext4: fix partial page writes While running extended fsx tests to verify the preceeding patches, a similar bug was also found in the write operation When ever a write operation begins or ends in a hole, or extends EOF, the partial page contained in the hole or beyond EOF needs to be zeroed out. To correct this the new ext4_discard_partial_page_buffers_no_lock routine is used to zero out the partial page, but only for buffer heads that are already unmapped. Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-06 21:53:01 -04:00
Allison Henderson	189e868fa8	ext4: fix fsx truncate failure While running extended fsx tests to verify the first two patches, a similar bug was also found in the truncate operation. This bug happens because the truncate routine only zeros the unblock aligned portion of the last page. This means that the block aligned portions of the page appearing after i_size are left unzeroed, and the buffer heads still mapped. This bug is corrected by using ext4_discard_partial_page_buffers in the truncate routine to zero the partial page and unmap the buffer headers. Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-06 21:49:44 -04:00
Jim Garlick	51b8b4fb32	fs/9p: Use protocol-defined value for lock/getlock 'type' field. Signed-off-by: Jim Garlick <garlick@llnl.gov> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2011-09-06 08:17:16 -05:00
Aneesh Kumar K.V	73f507171c	fs/9p: Always ask new inode in lookup for cache mode disabled This make sure we don't end up reusing the unlinked inode object. The ideal way is to use inode i_generation. But i_generation is not available in userspace always. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2011-09-06 08:17:15 -05:00
Aneesh Kumar K.V	f88657ce3f	fs/9p: Add OS dependent open flags in 9p protocol Some of the flags are OS/arch dependent we add a 9p protocol value which maps to asm-generic/fcntl.h values in Linux Based on the original patch from Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>	2011-09-06 08:17:15 -05:00
Aneesh Kumar K.V	45089142b1	fs/9p: Don't update file type when updating file attributes We should only update attributes that we can change on stat2inode. Also do file type initialization in v9fs_init_inode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2011-09-06 08:17:14 -05:00
Aneesh Kumar K.V	5441ae5eb3	fs/9p: Add fid before dentry instantiation d_instantiate marks the dentry positive. So a parallel lookup and mkdir of the directory can find dentry that doesn't have fid attached. This can result in both the code path doing v9fs_fid_add which results in v9fs_dentry leak. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>	2011-09-06 08:17:14 -05:00
Theodore Ts'o	decbd919f4	ext4: only call ext4_jbd2_file_inode when an inode has been extended In delayed allocation mode, it's important to only call ext4_jbd2_file_inode when the file has been extended. This is necessary to avoid a race which first got introduced in commit `678aaf481`, but which was made much more common with the introduction of the "punch hole" functionality. (Especially when dioread_nolock was enabled; when I could reliably reproduce this problem with xfstests #74.) The race is this: If while trying to writeback a delayed allocation inode, there is a need to map delalloc blocks, and we run out of space in the journal, and at the same time the inode is already on the committing transaction's t_inode_list (because for example while doing the punch hole operation, ext4_jbd2_file_inode() is called), then the commit operation will wait for the inode to finish all of its pending writebacks by calling filemap_fdatawait(), but since that inode has one or more pages with the PageWriteback flag set, the commit operation will wait forever, and the so the writeback of the inode can never take place, and the kjournald thread and the writeback thread end up waiting for each other --- forever. It's important at this point to recall why an inode is placed on the t_inode_list; it is to provide the data=ordered guarantees that we don't end up exposing stale data. In the case where we are truncating or punching a hole in the inode, there is no possibility that stale data could be exposed in the first place, so we don't need to put the inode on the t_inode_list! The right long-term fix is to get rid of data=ordered mode altogether, and only update the extent tree or indirect blocks after the data has been written. Until then, this change will also avoid some unnecessary waiting in the commit operation. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Allison Henderson <achender@linux.vnet.ibm.com> Cc: Jan Kara <jack@suse.cz>	2011-09-06 02:37:06 -04:00
Dan Carpenter	d2159fb7b8	jbd2: use gfp_t instead of int This silences some Sparse warnings: fs/jbd2/transaction.c:135:69: warning: incorrect type in argument 2 (different base types) fs/jbd2/transaction.c:135:69: expected restricted gfp_t [usertype] flags fs/jbd2/transaction.c:135:69: got int [signed] gfp_mask Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-04 10:20:14 -04:00
Theodore Ts'o	9ea7a0df63	jbd2: add debugging information to jbd2_journal_dirty_metadata() Add debugging information in case jbd2_journal_dirty_metadata() is called with a buffer_head which didn't have jbd2_journal_get_write_access() called on it, or if the journal_head has the wrong transaction in it. In addition, return an error code. This won't change anything for ocfs2, which will BUG_ON() the non-zero exit code. For ext4, the caller of this function is ext4_handle_dirty_metadata(), and on seeing a non-zero return code, will call __ext4_journal_stop(), which will print the function and line number of the (buggy) calling function and abort the journal. This will allow us to recover instead of bug halting, which is better from a robustness and reliability point of view. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-04 10:18:14 -04:00
J. Bruce Fields	f4dee24cca	nfsd4: move CLOSE_STATE special case to caller Move the CLOSE_STATE case into the unique caller that cares about it rather than putting it in preprocess_seqid_op. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-03 23:15:28 -04:00
Theodore Ts'o	56889787cf	ext4: improve handling of conflicting mount options If the user explicitly specifies conflicting mount options for delalloc or dioread_nolock and data=journal, fail the mount, instead of printing a warning and continuing (since many user's won't look at dmesg and notice the warning). Also, print a single warning that data=journal implies that delayed allocation is not on by default (since it's not supported), and furthermore that O_DIRECT is not supported. Improve the text in Documentation/filesystems/ext4.txt so this is clear there as well. Similarly, if the dioread_nolock mount option is specified when the file system block size != PAGE_SIZE, fail the mount instead of printing a warning message and ignoring the mount option. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-03 18:22:38 -04:00
Allison Henderson	2be4751b21	ext4: fix 2nd xfstests 127 punch hole failure This patch fixes a second punch hole bug found by xfstests 127. This bug happens because punch hole needs to flush the pages of the hole to avoid race conditions. But if the end of the hole is in the same page as i_size, the buffer heads beyond i_size need to be unmapped and the page needs to be zeroed after it is flushed. To correct this, the new ext4_discard_partial_page_buffers routine is used to zero and unmap the partial page beyond i_size if the end of the hole appears in the same page as i_size. The code has also been optimized to set the end of the hole to the page after i_size if the specified hole exceeds i_size, and the code that flushes the pages has been simplified. Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>	2011-09-03 11:56:52 -04:00
Allison Henderson	ba06208a13	ext4: fix xfstests 75, 112, 127 punch hole failure This patch addresses a bug found by xfstests 75, 112, 127 when blocksize = 1k This bug happens because the punch hole code only zeros out non block aligned regions of the page. This means that if the blocks are smaller than a page, then the block aligned regions of the page inside the hole are left un-zeroed, and their buffer heads are still mapped. This bug is corrected by using ext4_discard_partial_page_buffers to properly zero the partial page at the head and tail of the hole, and unmap the corresponding buffer heads This patch also addresses a bug reported by Lukas while working on a new patch to add discard support for loop devices using punch hole. The bug happened because of the first and last block number needed to be cast to a larger data type before calculating the byte offset, but since now we only need the byte offsets of the pages, we no longer even need to be calculating the byte offsets of the blocks. The code to do the block offset calculations is removed in this patch. Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com>	2011-09-03 11:55:59 -04:00
Allison Henderson	4e96b2dbbf	ext4: Add new ext4_discard_partial_page_buffers routines This patch adds two new routines: ext4_discard_partial_page_buffers and ext4_discard_partial_page_buffers_no_lock. The ext4_discard_partial_page_buffers routine is a wrapper function to ext4_discard_partial_page_buffers_no_lock. The wrapper function locks the page and passes it to ext4_discard_partial_page_buffers_no_lock. Calling functions that already have the page locked can call ext4_discard_partial_page_buffers_no_lock directly. The ext4_discard_partial_page_buffers_no_lock function zeros a specified range in a page, and unmaps the corresponding buffer heads. Only block aligned regions of the page will have their buffer heads unmapped. Unblock aligned regions will be mapped if needed so that they can be updated with the partial zero out. This function is meant to be used to update a page and its buffer heads to be zeroed and unmapped when the corresponding blocks have been released or will be released. This routine is used in the following scenarios: * A hole is punched and the non page aligned regions of the head and tail of the hole need to be discarded * The file is truncated and the partial page beyond EOF needs to be discarded * The end of a hole is in the same page as EOF. After the page is flushed, the partial page beyond EOF needs to be discarded. * A write operation begins or ends inside a hole and the partial page appearing before or after the write needs to be discarded * A write operation extends EOF and the partial page beyond EOF needs to be discarded This function takes a flag EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED which is used when a write operation begins or ends in a hole. When the EXT4_DISCARD_PARTIAL_PG_ZERO_UNMAPPED flag is used, only buffer heads that are already unmapped will have the corresponding regions of the page zeroed. Signed-off-by: Allison Henderson <achender@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2011-09-03 11:51:09 -04:00
J. Bruce Fields	68b66e8270	nfsd4: move double-confirm test to open_confirm I don't see the point of having this check in nfs4_preprocess_seqid_op() when it's only needed by the one caller. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-03 05:01:52 -04:00
J. Bruce Fields	77eaae8d44	nfsd4: simplify check_open logic Sometimes the single-exit style is good, sometimes it's unnecessarily convoluted.... Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-02 19:59:29 -04:00
J. Bruce Fields	7a8711c9a6	nfsd4: share common seqid checks Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-02 19:59:24 -04:00
Linus Torvalds	4d7b5a116f	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: fix ->write_inode return values xfs: fix xfs_mark_inode_dirty during umount xfs: deprecate the nodelaylog mount option	2011-09-02 08:25:23 -07:00
J. Bruce Fields	16d259418b	nfsd4: eliminate unused lt_stateowner This is used only as a local variable. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 11:35:30 -04:00
J. Bruce Fields	7c13f344cf	nfsd4: drop most stateowner refcounting Maybe we'll bring it back some day, but we don't have much real use for it now. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 11:12:47 -04:00
Christoph Hellwig	58d84c4ee0	xfs: fix ->write_inode return values Currently we always redirty an inode that was attempted to be written out synchronously but has been cleaned by an AIL pushed internall, which is rather bogus. Fix that by doing the i_update_core check early on and return 0 for it. Also include async calls for it, as doing any work for those is just as pointless. While we're at it also fix the sign for the EIO return in case of a filesystem shutdown, and fix the completely non-sensical locking around xfs_log_inode. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> (cherry picked from commit 297db93bb74cf687510313eb235a7aec14d67e97) Signed-off-by: Alex Elder <aelder@sgi.com>	2011-09-01 09:46:11 -05:00
J. Bruce Fields	fff6ca9cc4	nfsd4: eliminate impossible open replay case If open fails with any error other than nfserr_replay_me, then the main nfsd4_proc_compound() loop continues unconditionally to nfsd4_encode_operation(), which will always call encode_seqid_op_tail. Thus the condition we check for here does not occur. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 07:29:01 -04:00
J. Bruce Fields	5ec094c109	nfsd4: extend state lock over seqid replay logic There are currently a couple races in the seqid replay code: a retransmission could come while we're still encoding the original reply, or a new seqid-mutating call could come as we're encoding a replay. So, extend the state lock over the encoding (both encoding of a replayed reply and caching of the original encoded reply). I really hate doing this, and previously added the stateowner reference-counting code to avoid it (which was insufficient)--but I don't see a less complicated alternative at the moment. Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2011-09-01 07:07:59 -04:00
Christoph Hellwig	866e4ed774	xfs: fix xfs_mark_inode_dirty during umount During umount we do not add a dirty inode to the lru and wait for it to become clean first, but force writeback of data and metadata with I_WILL_FREE set. Currently there is no way for XFS to detect that the inode has been redirtied for metadata operations, as we skip the mark_inode_dirty call during teardown. Fix this by setting i_update_core nanually in that case, so that the inode gets flushed during inode reclaim. Alternatively we could enable calling mark_inode_dirty for inodes in I_WILL_FREE state, and let the VFS dirty tracking handle this. I decided against this as we will get better I/O patterns from reclaim compared to the synchronous writeout in write_inode_now, and always marking the inode dirty in some way from xfs_mark_inode_dirty is a better safetly net in either case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Alex Elder <aelder@sgi.com> (cherry picked from commit da6742a5a4cc844a9982fdd936ddb537c0747856) Signed-off-by: Alex Elder <aelder@sgi.com>	2011-08-31 17:59:39 -05:00

... 14 15 16 17 18 ...

24897 Commits