btrfs: convert snapshot/nocow exclusion to drew lock

This patch removes all the haphazard code that excluded nocow writers
while snapshot creation was pending and switches to using the drew lock
to ensure this invariant still holds.

'Readers' are snapshot creators from create_snapshot and 'writers' are
nocow writers from the buffered write path or btrfs_setsize. This
locking scheme allows multiple snapshots to run concurrently while any
nocow writers are blocked, since writes to the page cache in the nocow
path would make snapshots inconsistent.
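
As a rough sketch of how the two sides pair up (illustrative only, not
the literal kernel code; the function names snapshot_side() and
nocow_side() are made up, while the btrfs_drew_* helpers and
root->snapshot_lock come from this series):

/* Snapshot creation: a drew "reader"; any number may run concurrently. */
static void snapshot_side(struct btrfs_root *root)
{
	btrfs_drew_read_lock(&root->snapshot_lock);
	/* ... create the snapshot; nocow writers are excluded here ... */
	btrfs_drew_read_unlock(&root->snapshot_lock);
}

/* Nocow write: a drew "writer". It only trylocks, so a pending
 * snapshot simply pushes the write onto the COW path via -EAGAIN. */
static int nocow_side(struct btrfs_root *root)
{
	if (!btrfs_drew_try_write_lock(&root->snapshot_lock))
		return -EAGAIN;
	/* ... perform the nocow write to the page cache ... */
	btrfs_drew_write_unlock(&root->snapshot_lock);
	return 0;
}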

For performance reasons we'd like the ability to run multiple
concurrent snapshots, so the lock favors readers in this case. And when
there are no pending snapshots (which will be the majority of cases) we
rely on the per-cpu writers counter to avoid cacheline contention.
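
The writer-side fast path is roughly the following (a simplified
sketch, assuming the drew lock layout from the parent commit with an
atomic_t readers count and a percpu writers counter; it is not the
literal code in locking.c):

/* Sketch: take the nocow-writer side of the lock without blocking. */
static bool drew_try_write_lock_sketch(struct btrfs_drew_lock *lock)
{
	if (atomic_read(&lock->readers))
		return false;		/* a snapshot is pending */

	/* Per-cpu increment: concurrent nocow writers do not bounce a
	 * shared cacheline when no snapshot is running. */
	percpu_counter_inc(&lock->writers);

	/* Make the writer visible before re-checking for readers. */
	smp_mb();
	if (atomic_read(&lock->readers)) {
		/* Raced with a new snapshot: back off to the COW path. */
		percpu_counter_dec(&lock->writers);
		return false;
	}
	return true;
}

Once a snapshot has bumped the readers count, subsequent trylocks fail
immediately, which is how the scheme ends up favoring readers.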

The main gain from using the drew lock is that it's now a lot easier to
reason about the guarantees of the locking scheme and whether there is
some silent breakage lurking.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Author: Nikolay Borisov
Date: 2020-01-30 14:59:45 +02:00
Committer: David Sterba
Commit: dcc3eb9638
Parent: 2992df7326
6 changed files with 25 additions and 103 deletions

@@ -1553,8 +1553,7 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 	u64 num_bytes;
 	int ret;
-	ret = btrfs_start_write_no_snapshotting(root);
-	if (!ret)
+	if (!btrfs_drew_try_write_lock(&root->snapshot_lock))
 		return -EAGAIN;
 	lockstart = round_down(pos, fs_info->sectorsize);
@@ -1569,7 +1568,7 @@ static noinline int check_can_nocow(struct btrfs_inode *inode, loff_t pos,
 			NULL, NULL, NULL);
 	if (ret <= 0) {
 		ret = 0;
-		btrfs_end_write_no_snapshotting(root);
+		btrfs_drew_write_unlock(&root->snapshot_lock);
 	} else {
 		*write_bytes = min_t(size_t, *write_bytes ,
 				     num_bytes - pos + lockstart);
@@ -1675,7 +1674,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 						data_reserved, pos,
 						write_bytes);
 			else
-				btrfs_end_write_no_snapshotting(root);
+				btrfs_drew_write_unlock(&root->snapshot_lock);
 			break;
 		}
@@ -1779,7 +1778,7 @@ again:
 		release_bytes = 0;
 		if (only_release_metadata)
-			btrfs_end_write_no_snapshotting(root);
+			btrfs_drew_write_unlock(&root->snapshot_lock);
 		if (only_release_metadata && copied > 0) {
 			lockstart = round_down(pos,
@@ -1808,7 +1807,7 @@ again:
 	if (release_bytes) {
 		if (only_release_metadata) {
-			btrfs_end_write_no_snapshotting(root);
+			btrfs_drew_write_unlock(&root->snapshot_lock);
 			btrfs_delalloc_release_metadata(BTRFS_I(inode),
 					release_bytes, true);
 		} else {