xfs: merge fsync and O_SYNC handling

The guarantees for O_SYNC are exactly the same as the ones we need to
make for an fsync call (and given that Linux O_SYNC is O_DSYNC the
equivalent is fdatasync, but we treat both the same in XFS), except
that the data writeout only covers a range. Jan Kara has started
unifying these two paths for filesystems using the generic helpers, and
I've started to look at XFS.

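As a userspace illustration of the equivalence described above, here is a minimal sketch, assuming a Linux/POSIX system. The helper names write_then_fdatasync and open_dsync are hypothetical and not part of XFS; only open, pwrite and fdatasync are real calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes at offset off, then ask the
 * kernel to make the file data durable with fdatasync().
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int write_then_fdatasync(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;

	assert(fd >= 0);
	while (len > 0) {
		ssize_t n = pwrite(fd, p, len, off);

		if (n <= 0)
			return -1;	/* error or no progress */
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return fdatasync(fd);
}

/*
 * Hypothetical helper: open a file so that each write() only returns
 * once the data (and the metadata needed to read it back) is stable.
 */
int open_dsync(const char *path)
{
	return open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);
}
```

A write through a descriptor from open_dsync() asks for the same durability as write_then_fdatasync() on a plain descriptor. The difference is that fdatasync() syncs the whole file, while the O_DSYNC write only has to cover the bytes just written.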
The actual transactions committed by xfs_fsync and xfs_write_sync_logforce
have different transaction numbers, but are otherwise exactly the same.
We'll only use the fsync transaction going forward. One major difference
is that xfs_write_sync_logforce never issues a cache flush unless we
commit a transaction causing that as a side-effect, which is an obvious
bug in the O_SYNC handling. Second, all the locking and i_update_size
vs i_update_core changes from 978b723712 never made it to
xfs_write_sync_logforce, so we add them back.

To make xfs_fsync easily usable from the O_SYNC path, the filemap_fdatawait
call is moved up to xfs_file_fsync, so that we don't wait on the whole
file after we already waited for our portion in xfs_write.

We'll also use a plain call to filemap_write_and_wait_range instead
of the previous sync_page_range, which did it in two steps including
a half-hearted inode writeout that doesn't help us.

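The resulting call structure can be pictured with a toy userspace model. This is only a sketch under stated assumptions: every name below (toy_inode, toy_fsync_core and so on) is hypothetical, and the counters stand in for the real writeback, log force and cache flush work. It is not the XFS code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an inode; every field is hypothetical. */
struct toy_inode {
	size_t size;		/* file size in bytes */
	size_t bytes_waited;	/* data bytes we wrote out and waited on */
	int    log_forces;	/* synchronous log commits issued */
	int    cache_flushes;	/* disk cache flushes issued */
	int    meta_dirty;	/* nonzero if metadata still needs logging */
};

/* Models a ranged write-and-wait over [start, end). */
static void toy_write_and_wait_range(struct toy_inode *ip,
				     size_t start, size_t end)
{
	assert(start <= end && end <= ip->size);
	ip->bytes_waited += end - start;
}

/*
 * Models the shared metadata step: log the inode if needed and always
 * flush the disk cache. It does no data waiting of its own.
 */
static void toy_fsync_core(struct toy_inode *ip)
{
	if (ip->meta_dirty) {
		ip->log_forces++;
		ip->meta_dirty = 0;
	}
	ip->cache_flushes++;
}

/* fsync entry point: wait on the whole file, then the shared step. */
void toy_file_fsync(struct toy_inode *ip)
{
	toy_write_and_wait_range(ip, 0, ip->size);
	toy_fsync_core(ip);
}

/* O_SYNC write path: wait on the written range only, then the shared step. */
void toy_sync_write(struct toy_inode *ip, size_t pos, size_t len)
{
	toy_write_and_wait_range(ip, pos, pos + len);
	toy_fsync_core(ip);
}
```

Keeping the data wait in the two entry points is what lets the O_SYNC path reuse the shared step without waiting on the whole file a second time.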
Once we're done with this, also remove the now useless i_update_size
tracking.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
committed by Felix Blyakher
parent bd16956599
commit 13e6d5cdde

@@ -87,90 +87,6 @@ xfs_write_clear_setuid(
 	return 0;
 }
 
-/*
- * Handle logging requirements of various synchronous types of write.
- */
-int
-xfs_write_sync_logforce(
-	xfs_mount_t *mp,
-	xfs_inode_t *ip)
-{
-	int error = 0;
-
-	/*
-	 * If we're treating this as O_DSYNC and we have not updated the
-	 * size, force the log.
-	 */
-	if (!(mp->m_flags & XFS_MOUNT_OSYNCISOSYNC) &&
-	    !(ip->i_update_size)) {
-		xfs_inode_log_item_t *iip = ip->i_itemp;
-
-		/*
-		 * If an allocation transaction occurred
-		 * without extending the size, then we have to force
-		 * the log up the proper point to ensure that the
-		 * allocation is permanent. We can't count on
-		 * the fact that buffered writes lock out direct I/O
-		 * writes - the direct I/O write could have extended
-		 * the size nontransactionally, then finished before
-		 * we started. xfs_write_file will think that the file
-		 * didn't grow but the update isn't safe unless the
-		 * size change is logged.
-		 *
-		 * Force the log if we've committed a transaction
-		 * against the inode or if someone else has and
-		 * the commit record hasn't gone to disk (e.g.
-		 * the inode is pinned). This guarantees that
-		 * all changes affecting the inode are permanent
-		 * when we return.
-		 */
-		if (iip && iip->ili_last_lsn) {
-			error = _xfs_log_force(mp, iip->ili_last_lsn,
-					XFS_LOG_FORCE | XFS_LOG_SYNC, NULL);
-		} else if (xfs_ipincount(ip) > 0) {
-			error = _xfs_log_force(mp, (xfs_lsn_t)0,
-					XFS_LOG_FORCE | XFS_LOG_SYNC, NULL);
-		}
-
-	} else {
-		xfs_trans_t *tp;
-
-		/*
-		 * O_SYNC or O_DSYNC _with_ a size update are handled
-		 * the same way.
-		 *
-		 * If the write was synchronous then we need to make
-		 * sure that the inode modification time is permanent.
-		 * We'll have updated the timestamp above, so here
-		 * we use a synchronous transaction to log the inode.
-		 * It's not fast, but it's necessary.
-		 *
-		 * If this a dsync write and the size got changed
-		 * non-transactionally, then we need to ensure that
-		 * the size change gets logged in a synchronous
-		 * transaction.
-		 */
-		tp = xfs_trans_alloc(mp, XFS_TRANS_WRITE_SYNC);
-		if ((error = xfs_trans_reserve(tp, 0,
-						XFS_SWRITE_LOG_RES(mp),
-						0, 0, 0))) {
-			/* Transaction reserve failed */
-			xfs_trans_cancel(tp, 0);
-		} else {
-			/* Transaction reserve successful */
-			xfs_ilock(ip, XFS_ILOCK_EXCL);
-			xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
-			xfs_trans_ihold(tp, ip);
-			xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
-			xfs_trans_set_sync(tp);
-			error = xfs_trans_commit(tp, 0);
-			xfs_iunlock(ip, XFS_ILOCK_EXCL);
-		}
-	}
-
-	return error;
-}
-
 /*
  * Force a shutdown of the filesystem instantly while keeping
  * the filesystem consistent. We don't do an unmount here; just shutdown