ext4: abort the filesystem if failed to async write metadata buffer

There is a risk of filesystem inconsistency if we failed to async write
back metadata buffer in the background. Because of current buffer's end
io procedure is handled by end_buffer_async_write() in the block layer,
and it only clear the buffer's uptodate flag and mark the write_io_error
flag, so ext4 cannot detect such failure immediately. In most cases of
getting metadata buffer (e.g. ext4_read_inode_bitmap()), although the
buffer's data is actually uptodate, it may still read data from disk
because the buffer's uptodate flag has been cleared. Finally, it may
lead to on-disk filesystem inconsistency if reading old data from the
disk successfully and write them out again.

This patch detect bdev mapping->wb_err when getting journal's write
access and mark the filesystem error if bdev's mapping->wb_err was
increased, this could prevent further writing and potential
inconsistency.

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Suggested-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20200620025427.1756360-2-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This commit is contained in:
zhangyi (F)
2020-06-20 10:54:23 +08:00
committed by Theodore Ts'o
parent c1d2c7d47e
commit bc71726c72
3 changed files with 45 additions and 0 deletions

View File

@@ -4765,6 +4765,15 @@ no_journal:
}
#endif /* CONFIG_QUOTA */
/*
* Save the original bdev mapping's wb_err value which could be
* used to detect the metadata async write error.
*/
spin_lock_init(&sbi->s_bdev_wb_lock);
if (!sb_rdonly(sb))
errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
&sbi->s_bdev_wb_err);
sb->s_bdev->bd_super = sb;
EXT4_SB(sb)->s_mount_state |= EXT4_ORPHAN_FS;
ext4_orphan_cleanup(sb, es);
EXT4_SB(sb)->s_mount_state &= ~EXT4_ORPHAN_FS;
@@ -5654,6 +5663,14 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
goto restore_opts;
}
/*
* Update the original bdev mapping's wb_err value
* which could be used to detect the metadata async
* write error.
*/
errseq_check_and_advance(&sb->s_bdev->bd_inode->i_mapping->wb_err,
&sbi->s_bdev_wb_err);
/*
* Mounting a RDONLY partition read-write, so reread
* and store the current valid flag. (It may have