Commit Graph

968806 Commits

Author SHA1 Message Date
Chao Yu
d0d83de42f f2fs: compress: support compress level
Expand 'compress_algorithm' mount option to accept parameter as format of
<algorithm>:<level>, by this way, it gives a way to allow user to do more
specified config on lz4 and zstd compression level, then f2fs compression
can provide higher compress ratio.

In order to set compress level for lz4 algorithm, it needs to set
CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC config to enable lz4hc
compress algorithm.

CR and performance number on lz4/lz4hc algorithm:

dd if=enwik9 of=compressed_file conv=fsync

Original blocks:	244382

			lz4			lz4hc-9
compressed blocks	170647			163270
compress ratio		69.8%			66.8%
speed			16.4207 s, 60.9 MB/s	26.7299 s, 37.4 MB/s

compress ratio = after / before

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-26 23:05:55 -08:00
Chao Yu
8723b85860 f2fs: compress: deny setting unsupported compress algorithm
If kernel doesn't support certain kinds of compress algorithm, deny to set
them as compress algorithm of f2fs via 'compress_algorithm=%s' mount option.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-26 23:05:55 -08:00
Chao Yu
2a17e73984 f2fs: relocate f2fs_precache_extents()
Relocate f2fs_precache_extents() in prior to check_swap_activate(),
then extent cache can be enabled before its use.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 14:36:46 -08:00
Chao Yu
3dabc6f5d3 f2fs: enforce the immutable flag on open files
This patch ports commit 02b016ca7f ("ext4: enforce the immutable
flag on open files") to f2fs.

According to the chattr man page, "a file with the 'i' attribute
cannot be modified..."  Historically, this was only enforced when the
file was opened, per the rest of the description, "... and the file
can not be opened in write mode".

There is general agreement that we should standardize all file systems
to prevent modifications even for files that were opened at the time
the immutable flag is set.  Eventually, a change to enforce this at
the VFS layer should be landing in mainline.

Cc: stable@kernel.org
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 14:36:46 -08:00
Chao Yu
5e7d71413c f2fs: enhance to update i_mode and acl atomically in f2fs_setattr()
Previously, in f2fs_setattr(), we don't update S_ISUID|S_ISGID|S_ISVTX
bits with S_IRWXUGO bits and acl entries atomically, so in error path,
chmod() may partially success, this patch enhances to make chmod() flow
being atomical.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 12:50:52 -08:00
Weichao Guo
dd6d813fa2 f2fs: fix to set inode->i_mode correctly for posix_acl_update_mode
We should update the ~S_IRWXUGO part of inode->i_mode in __setattr_copy,
because posix_acl_update_mode updates mode based on inode->i_mode,
which finally overwrites the ~S_IRWXUGO part of i_acl_mode with old i_mode.

Testcase to reproduce this bug:
0. adduser abc
1. mkfs.f2fs /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. mkdir /mnt/f2fs/test
4. setfacl -m u🔤r /mnt/f2fs/test
5. chmod +s /mnt/f2fs/test

Signed-off-by: Weichao Guo <guoweichao@oppo.com>
Signed-off-by: Bin Shu <shubin@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 12:50:52 -08:00
Zheng Yongjun
14e6dc50ea f2fs: Replace expression with offsetof()
Use the existing offsetof() macro instead of duplicating code.

Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 12:50:51 -08:00
Jaegeuk Kim
4ff2fc7ee7 f2fs: handle unallocated section and zone on pinned/atgc
If we have large section/zone, unallocated segment makes them corrupted.

E.g.,

  - Pinned file:       -1 119304647 119304647
  - ATGC   data:       -1 119304647 119304647

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-01-06 12:50:51 -08:00
Eric Biggers
78d21c23dc fs-verity: move structs needed for file signing to UAPI header
Although it isn't used directly by the ioctls,
"struct fsverity_descriptor" is required by userspace programs that need
to compute fs-verity file digests in a standalone way.  Therefore
it's also needed to sign files in a standalone way.

Similarly, "struct fsverity_formatted_digest" (previously called
"struct fsverity_signed_digest" which was misleading) is also needed to
sign files if the built-in signature verification is being used.

Therefore, move these structs to the UAPI header.

While doing this, try to make it clear that the signature-related fields
in fsverity_descriptor aren't used in the file digest computation.

Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:50:32 -08:00
Eric Biggers
5fe13f59fa fs-verity: rename "file measurement" to "file digest"
I originally chose the name "file measurement" to refer to the fs-verity
file digest to avoid confusion with traditional full-file digests or
with the bare root hash of the Merkle tree.

But the name "file measurement" hasn't caught on, and usually people are
calling it something else, usually the "file digest".  E.g. see
"struct fsverity_digest" and "struct fsverity_formatted_digest", the
libfsverity_compute_digest() and libfsverity_sign_digest() functions in
libfsverity, and the "fsverity digest" command.

Having multiple names for the same thing is always confusing.

So to hopefully avoid confusion in the future, rename
"fs-verity file measurement" to "fs-verity file digest".

This leaves FS_IOC_MEASURE_VERITY as the only reference to "measure" in
the kernel, which makes some amount of sense since the ioctl is actively
"measuring" the file.

I'll be renaming this in fsverity-utils too (though similarly the
'fsverity measure' command, which is a wrapper for
FS_IOC_MEASURE_VERITY, will stay).

Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:50:31 -08:00
Eric Biggers
7dcf6d6887 fs-verity: rename fsverity_signed_digest to fsverity_formatted_digest
The name "struct fsverity_signed_digest" is causing confusion because it
isn't actually a signed digest, but rather it's the way that the digest
is formatted in order to be signed.  Rename it to
"struct fsverity_formatted_digest" to prevent this confusion.

Also update the struct's comment to clarify that it's specific to the
built-in signature verification support and isn't a requirement for all
fs-verity users.

I'll be renaming this struct in fsverity-utils too.

Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:50:31 -08:00
Eric Biggers
c864727cb6 fs-verity: remove filenames from file comments
Embedding the file path inside kernel source code files isn't
particularly useful as often files are moved around and the paths become
incorrect.  checkpatch.pl warns about this since v5.10-rc1.

Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:50:31 -08:00
Eric Biggers
a7359960b6 fscrypt: allow deleting files with unsupported encryption policy
Currently it's impossible to delete files that use an unsupported
encryption policy, as the kernel will just return an error when
performing any operation on the top-level encrypted directory, even just
a path lookup into the directory or opening the directory for readdir.

More specifically, this occurs in any of the following cases:

- The encryption context has an unrecognized version number.  Current
  kernels know about v1 and v2, but there could be more versions in the
  future.

- The encryption context has unrecognized encryption modes
  (FSCRYPT_MODE_*) or flags (FSCRYPT_POLICY_FLAG_*), an unrecognized
  combination of modes, or reserved bits set.

- The encryption key has been added and the encryption modes are
  recognized but aren't available in the crypto API -- for example, a
  directory is encrypted with FSCRYPT_MODE_ADIANTUM but the kernel
  doesn't have CONFIG_CRYPTO_ADIANTUM enabled.

It's desirable to return errors for most operations on files that use an
unsupported encryption policy, but the current behavior is too strict.
We need to allow enough to delete files, so that people can't be stuck
with undeletable files when downgrading kernel versions.  That includes
allowing directories to be listed and allowing dentries to be looked up.

Fix this by modifying the key setup logic to treat an unsupported
encryption policy in the same way as "key unavailable" in the cases that
are required for a recursive delete to work: preparing for a readdir or
a dentry lookup, revalidating a dentry, or checking whether an inode has
the same encryption policy as its parent directory.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:42 -08:00
Eric Biggers
94bb3de392 fscrypt: unexport fscrypt_get_encryption_info()
Now that fscrypt_get_encryption_info() is only called from files in
fs/crypto/ (due to all key setup now being handled by higher-level
helper functions instead of directly by filesystems), unexport it and
move its declaration to fscrypt_private.h.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:41 -08:00
Eric Biggers
0f25cf6b92 fscrypt: move fscrypt_require_key() to fscrypt_private.h
fscrypt_require_key() is now only used by files in fs/crypto/.  So
reduce its visibility to fscrypt_private.h.  This is also a prerequsite
for unexporting fscrypt_get_encryption_info().

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:41 -08:00
Eric Biggers
9aa0e0ab0d fscrypt: move body of fscrypt_prepare_setattr() out-of-line
In preparation for reducing the visibility of fscrypt_require_key() by
moving it to fscrypt_private.h, move the call to it from
fscrypt_prepare_setattr() to an out-of-line function.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:41 -08:00
Eric Biggers
4af7c34103 fscrypt: introduce fscrypt_prepare_readdir()
The last remaining use of fscrypt_get_encryption_info() from filesystems
is for readdir (->iterate_shared()).  Every other call is now in
fs/crypto/ as part of some other higher-level operation.

We need to add a new argument to fscrypt_get_encryption_info() to
indicate whether the encryption policy is allowed to be unrecognized or
not.  Doing this is easier if we can work with high-level operations
rather than direct filesystem use of fscrypt_get_encryption_info().

So add a function fscrypt_prepare_readdir() which wraps the call to
fscrypt_get_encryption_info() for the readdir use case.

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:41 -08:00
Eric Biggers
7081792afe ext4: don't call fscrypt_get_encryption_info() from dx_show_leaf()
The call to fscrypt_get_encryption_info() in dx_show_leaf() is too low
in the call tree; fscrypt_get_encryption_info() should have already been
called when starting the directory operation.  And indeed, it already
is.  Moreover, the encryption key is guaranteed to already be available
because dx_show_leaf() is only called when adding a new directory entry.

And even if the key wasn't available, dx_show_leaf() uses
fscrypt_fname_disk_to_usr() which knows how to create a no-key name.

So for the above reasons, and because it would be desirable to stop
exporting fscrypt_get_encryption_info() directly to filesystems, remove
the call to fscrypt_get_encryption_info() from dx_show_leaf().

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:40 -08:00
Eric Biggers
c247d00ee0 ubifs: remove ubifs_dir_open()
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.

Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).

Link: https://lore.kernel.org/r/20201203022041.230976-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:40 -08:00
Eric Biggers
39eea38802 f2fs: remove f2fs_dir_open()
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.

Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20201203022041.230976-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:40 -08:00
Eric Biggers
757f7f1302 ext4: remove ext4_dir_open()
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.

Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).

Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:40 -08:00
Eric Biggers
3b743b0063 fscrypt: simplify master key locking
The stated reasons for separating fscrypt_master_key::mk_secret_sem from
the standard semaphore contained in every 'struct key' no longer apply.

First, due to commit a992b20cd4 ("fscrypt: add
fscrypt_prepare_new_inode() and fscrypt_set_context()"),
fscrypt_get_encryption_info() is no longer called from within a
filesystem transaction.

Second, due to commit d3ec10aa95 ("KEYS: Don't write out to userspace
while holding key semaphore"), the semaphore for the "keyring" key type
no longer ranks above page faults.

That leaves performance as the only possible reason to keep the separate
mk_secret_sem.  Specifically, having mk_secret_sem reduces the
contention between setup_file_encryption_key() and
FS_IOC_{ADD,REMOVE}_ENCRYPTION_KEY.  However, these ioctls aren't
executed often, so this doesn't seem to be worth the extra complexity.

Therefore, simplify the locking design by just using key->sem instead of
mk_secret_sem.

Link: https://lore.kernel.org/r/20201117032626.320275-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:39 -08:00
Eric Biggers
26461e30a7 fscrypt: remove unnecessary calls to fscrypt_require_key()
In an encrypted directory, a regular dentry (one that doesn't have the
no-key name flag) can only be created if the directory's encryption key
is available.

Therefore the calls to fscrypt_require_key() in __fscrypt_prepare_link()
and __fscrypt_prepare_rename() are unnecessary, as these functions
already check that the dentries they're given aren't no-key names.

Remove these unnecessary calls to fscrypt_require_key().

Link: https://lore.kernel.org/r/20201118075609.120337-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:39 -08:00
Eric Biggers
cfdfa25f08 ubifs: prevent creating duplicate encrypted filenames
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.

Fix this bug on ubifs by rejecting no-key dentries in ubifs_create(),
ubifs_mkdir(), ubifs_mknod(), and ubifs_symlink().

Note that ubifs doesn't actually report the duplicate filenames from
readdir, but rather it seems to replace the original dentry with a new
one (which is still wrong, just a different effect from ext4).

On ubifs, this fixes xfstest generic/595 as well as the new xfstest I
wrote specifically for this bug.

Fixes: f4f61d2cc6 ("ubifs: Implement encrypted filenames")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:39 -08:00
Eric Biggers
ee957e5d92 f2fs: prevent creating duplicate encrypted filenames
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.

Fix this bug on f2fs by rejecting no-key dentries in f2fs_add_link().

Note that the weird check for the current task in f2fs_do_add_link()
seems to make this bug difficult to reproduce on f2fs.

Fixes: 9ea97163c6 ("f2fs crypto: add filename encryption for f2fs_add_link")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:39 -08:00
Eric Biggers
e09b02e8c6 ext4: prevent creating duplicate encrypted filenames
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.

Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().

Note that the duplicate check in ext4_find_dest_de() sometimes prevented
this bug.  However in many cases it didn't, since ext4_find_dest_de()
doesn't examine every dentry.

Fixes: 4461471107 ("ext4 crypto: enable filename encryption")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:38 -08:00
Eric Biggers
6b764a0e30 fscrypt: add fscrypt_is_nokey_name()
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.

Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry.  The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative.  Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable.  However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.

If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.

In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)

In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check.  Use this helper function for
the existing checks that fs/crypto/ does for rename and link.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:38 -08:00
Eric Biggers
9eb2976da7 fscrypt: remove kernel-internal constants from UAPI header
There isn't really any valid reason to use __FSCRYPT_MODE_MAX or
FSCRYPT_POLICY_FLAGS_VALID in a userspace program.  These constants are
only meant to be used by the kernel internally, and they are defined in
the UAPI header next to the mode numbers and flags only so that kernel
developers don't forget to update them when adding new modes or flags.

In https://lkml.kernel.org/r/20201005074133.1958633-2-satyat@google.com
there was an example of someone wanting to use __FSCRYPT_MODE_MAX in a
user program, and it was wrong because the program would have broken if
__FSCRYPT_MODE_MAX were ever increased.  So having this definition
available is harmful.  FSCRYPT_POLICY_FLAGS_VALID has the same problem.

So, remove these definitions from the UAPI header.  Replace
FSCRYPT_POLICY_FLAGS_VALID with just listing the valid flags explicitly
in the one kernel function that needs it.  Move __FSCRYPT_MODE_MAX to
fscrypt_private.h, remove the double underscores (which were only
present to discourage use by userspace), and add a BUILD_BUG_ON() and
comments to (hopefully) ensure it is kept in sync.

Keep the old name FS_POLICY_FLAGS_VALID, since it's been around for
longer and there's a greater chance that removing it would break source
compatibility with some program.  Indeed, mtd-utils is using it in
an #ifdef, and removing it would introduce compiler warnings (about
FS_POLICY_FLAGS_PAD_* being redefined) into the mtd-utils build.
However, reduce its value to 0x07 so that it only includes the flags
with old names (the ones present before Linux 5.4), and try to make it
clear that it's now "frozen" and no new flags should be added to it.

Fixes: 2336d0deb2 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20201024005132.495952-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2020-12-28 14:49:38 -08:00
Chao Yu
3ed3a9407a f2fs: compress: fix compression chksum
This patch addresses minor issues in compression chksum.

Fixes: b28f047b28c5 ("f2fs: compress: support chksum")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:20 -08:00
Chao Yu
de3e65bbe3 f2fs: fix shift-out-of-bounds in sanity_check_raw_super()
syzbot reported a bug which could cause shift-out-of-bounds issue,
fix it.

Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:120
 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
 sanity_check_raw_super fs/f2fs/super.c:2812 [inline]
 read_raw_super_block fs/f2fs/super.c:3267 [inline]
 f2fs_fill_super.cold+0x16c9/0x16f6 fs/f2fs/super.c:3519
 mount_bdev+0x34d/0x410 fs/super.c:1366
 legacy_get_tree+0x105/0x220 fs/fs_context.c:592
 vfs_get_tree+0x89/0x2f0 fs/super.c:1496
 do_new_mount fs/namespace.c:2896 [inline]
 path_mount+0x12ae/0x1e70 fs/namespace.c:3227
 do_mount fs/namespace.c:3240 [inline]
 __do_sys_mount fs/namespace.c:3448 [inline]
 __se_sys_mount fs/namespace.c:3425 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3425
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: syzbot+ca9a785f8ac472085994@syzkaller.appspotmail.com
Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:19 -08:00
Daeho Jeong
d4ea8d8b67 f2fs: fix race of pending_pages in decompression
I found out f2fs_free_dic() is invoked in a wrong timing, but
f2fs_verify_bio() still needed the dic info and it triggered the
below kernel panic. It has been caused by the race condition of
pending_pages value between decompression and verity logic, when
the same compression cluster had been split in different bios.
By split bios, f2fs_verify_bio() ended up with decreasing
pending_pages value before it is reset to nr_cpages by
f2fs_decompress_pages() and caused the kernel panic.

[ 4416.564763] Unable to handle kernel NULL pointer dereference
               at virtual address 0000000000000000
...
[ 4416.896016] Workqueue: fsverity_read_queue f2fs_verity_work
[ 4416.908515] pc : fsverity_verify_page+0x20/0x78
[ 4416.913721] lr : f2fs_verify_bio+0x11c/0x29c
[ 4416.913722] sp : ffffffc019533cd0
[ 4416.913723] x29: ffffffc019533cd0 x28: 0000000000000402
[ 4416.913724] x27: 0000000000000001 x26: 0000000000000100
[ 4416.913726] x25: 0000000000000001 x24: 0000000000000004
[ 4416.913727] x23: 0000000000001000 x22: 0000000000000000
[ 4416.913728] x21: 0000000000000000 x20: ffffffff2076f9c0
[ 4416.913729] x19: ffffffff2076f9c0 x18: ffffff8a32380c30
[ 4416.913731] x17: ffffffc01f966d97 x16: 0000000000000298
[ 4416.913732] x15: 0000000000000000 x14: 0000000000000000
[ 4416.913733] x13: f074faec89ffffff x12: 0000000000000000
[ 4416.913734] x11: 0000000000001000 x10: 0000000000001000
[ 4416.929176] x9 : ffffffff20d1f5c7 x8 : 0000000000000000
[ 4416.929178] x7 : 626d7464ff286b6b x6 : ffffffc019533ade
[ 4416.929179] x5 : 000000008049000e x4 : ffffffff2793e9e0
[ 4416.929180] x3 : 000000008049000e x2 : ffffff89ecfa74d0
[ 4416.929181] x1 : 0000000000000c40 x0 : ffffffff2076f9c0
[ 4416.929184] Call trace:
[ 4416.929187]  fsverity_verify_page+0x20/0x78
[ 4416.929189]  f2fs_verify_bio+0x11c/0x29c
[ 4416.929192]  f2fs_verity_work+0x58/0x84
[ 4417.050667]  process_one_work+0x270/0x47c
[ 4417.055354]  worker_thread+0x27c/0x4d8
[ 4417.059784]  kthread+0x13c/0x320
[ 4417.063693]  ret_from_fork+0x10/0x18

Chao pointed this can happen by the below race condition.

Thread A        f2fs_post_read_wq          fsverity_wq
- f2fs_read_multi_pages()
  - f2fs_alloc_dic
   - dic->pending_pages = 2
   - submit_bio()
   - submit_bio()
               - f2fs_post_read_work() handle first bio
                - f2fs_decompress_work()
                 - __read_end_io()
                  - f2fs_decompress_pages()
                   - dic->pending_pages--
                - enqueue f2fs_verity_work()
                                           - f2fs_verity_work() handle first bio
                                            - f2fs_verify_bio()
                                             - dic->pending_pages--
               - f2fs_post_read_work() handle second bio
                - f2fs_decompress_work()
                - enqueue f2fs_verity_work()
                                            - f2fs_verify_pages()
                                            - f2fs_free_dic()

                                          - f2fs_verity_work() handle second bio
                                           - f2fs_verfy_bio()
                                                 - use-after-free on dic

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:19 -08:00
Chao Yu
b91c6825c3 f2fs: fix to account inline xattr correctly during recovery
During recovery, we may missed to update inline xattr count correctly,
fix it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:19 -08:00
Jack Qiu
b0858a20fe f2fs: inline: fix wrong inline inode stat
Miss to stat inline inode in f2fs_recover_inline_data.

Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:19 -08:00
Jack Qiu
fb7f57b395 f2fs: inline: correct comment in f2fs_recover_inline_data
In 3rd scene, it should remove data blocks instead of inline_data.

Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:18 -08:00
Yangtao Li
3ee0967633 f2fs: don't check PAGE_SIZE again in sanity_check_raw_super()
Many flash devices read and write a single IO based on a multiple
of 4KB, and we support only 4KB page cache size now.

Since we already check page size in init_f2fs_fs(), so remove page
size check in sanity_check_raw_super().

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shaohua Liu <liush@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:18 -08:00
Yangtao Li
d2aa76bd14 f2fs: convert to F2FS_*_INO macro
Use F2FS_ROOT_INO, F2FS_NODE_INO and F2FS_META_INO macro
for better code readability.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shaohua Liu <liush@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:18 -08:00
Jaegeuk Kim
160d75f489 f2fs: introduce max_io_bytes, a sysfs entry, to limit bio size
This patch adds max_io_bytes to limit bio size when f2fs tries to merge
consecutive IOs. This can give a testing point to split out bios and check
end_io handles those bios correctly. This is used to capture a recent bug
on the decompression and fsverity flow.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:17 -08:00
Jaegeuk Kim
be33871db7 f2fs: don't allow any writes on readonly mount
generic_make_request: Trying to write to read-only block-device dm-5 (partno 0)
WARNING: CPU: 7 PID: 546 at block/blk-core.c:2190 generic_make_request_checks+0x664/0x690
pc : generic_make_request_checks+0x664/0x690
lr : generic_make_request_checks+0x664/0x690
Call trace:
 generic_make_request_checks+0x664/0x690
 generic_make_request+0xf0/0x3a4
 submit_bio+0x80/0x250
 __submit_merged_bio+0x368/0x4e0
 __submit_merged_write_cond.llvm.12294350193007536502+0xe0/0x3e8
 f2fs_wait_on_page_writeback+0x84/0x128
 f2fs_convert_inline_page+0x35c/0x6f8
 f2fs_convert_inline_inode+0xe0/0x2e0
 f2fs_file_mmap+0x48/0x9c
 mmap_region+0x41c/0x74c
 do_mmap+0x40c/0x4fc
 vm_mmap_pgoff+0xb8/0x114
 vm_mmap+0x34/0x48
 elf_map+0x68/0x108
 load_elf_binary+0x538/0xb70
 search_binary_handler+0xac/0x1dc
 exec_binprm+0x50/0x15c
 __do_execve_file+0x620/0x740
 __arm64_sys_execve+0x54/0x68
 el0_svc_common+0x9c/0x168
 el0_svc_handler+0x60/0x6c
 el0_svc+0x8/0xc

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:17 -08:00
Jaegeuk Kim
317c4f58d6 f2fs: avoid race condition for shrinker count
Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
wrong do_shinker work. Let's avoid to return insane overflowed value by adding
single tracking value.

Reported-by: Light Hsieh <Light.Hsieh@mediatek.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:17 -08:00
Daeho Jeong
46806a8d82 f2fs: add F2FS_IOC_DECOMPRESS_FILE and F2FS_IOC_COMPRESS_FILE
Added two ioctl to decompress/compress explicitly the compression
enabled file in "compress_mode=user" mount option.

Using these two ioctls, the users can make a control of compression
and decompression of their files.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:17 -08:00
Daeho Jeong
f15a200ae8 f2fs: add compress_mode mount option
We will add a new "compress_mode" mount option to control file
compression mode. This supports "fs" and "user". In "fs" mode (default),
f2fs does automatic compression on the compression enabled files.
In "user" mode, f2fs disables the automaic compression and gives the
user discretion of choosing the target file and the timing. It means
the user can do manual compression/decompression on the compression
enabled files using ioctls.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:16 -08:00
Yangtao Li
516758ca20 f2fs: Remove unnecessary unlikely()
WARN_ON() already contains an unlikely(), so it's not necessary
to use unlikely.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shuosheng Huang <huangshuosheng@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:16 -08:00
Jack Qiu
d86ac27d37 f2fs: init dirty_secmap incorrectly
section is dirty, but dirty_secmap may not set

Reported-by: Jia Yang <jiayang5@huawei.com>
Fixes: da52f8ade4 ("f2fs: get the right gc victim section when section has several segments")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jack Qiu <jack.qiu@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:16 -08:00
Jaegeuk Kim
ecc8b5c902 f2fs: remove buffer_head which has 32bits limit
This patch removes buffer_head dependency when getting block addresses.
Light reported there's a 32bit issue in f2fs_fiemap where map_bh.b_size is
32bits while len is 64bits given by user. This will give wrong length to
f2fs_map_block.

Reported-by: Light Hsieh <Light.Hsieh@mediatek.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:16 -08:00
Jaegeuk Kim
d763d7a72c f2fs: fix wrong block count instead of bytes
We should convert cur_lblock, a block count, to bytes for len.

Fixes: af4b6b8edf ("f2fs: introduce check_swap_activate_fast()")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:15 -08:00
Jaegeuk Kim
366d9ab986 f2fs: use new conversion functions between blks and bytes
This patch cleans up blks and bytes conversions.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:15 -08:00
Jaegeuk Kim
1c870991f5 f2fs: rename logical_to_blk and blk_to_logical
This patch renames two functions like below having u64.
 - logical_to_blk to bytes_to_blks
 - blk_to_logical to blks_to_bytes

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:15 -08:00
Chao Yu
5a11975441 f2fs: fix kbytes written stat for multi-device case
For multi-device case, one f2fs image includes multi devices, so it
needs to account bytes written of all block devices belong to the image
rather than one main block device, fix it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:15 -08:00
Chao Yu
1f004d62f8 f2fs: compress: support chksum
This patch supports to store chksum value with compressed
data, and verify the integrality of compressed data while
reading the data.

The feature can be enabled through specifying mount option
'compress_chksum'.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:14 -08:00
Chao Yu
34315f8872 f2fs: fix to avoid REQ_TIME and CP_TIME collision
Lei Li reported a issue: if foreground operations are frequent, background
checkpoint may be always skipped due to below check, result in losing more
data after sudden power-cut.

f2fs_balance_fs_bg()
...
	if (!is_idle(sbi, REQ_TIME) &&
		(!excess_dirty_nats(sbi) && !excess_dirty_nodes(sbi)))
		return;

E.g:
cp_interval = 5 second
idle_interval = 2 second
foreground operation interval = 1 second (append 1 byte per second into file)

In such case, no matter when it calls f2fs_balance_fs_bg(), is_idle(, REQ_TIME)
returns false, result in skipping background checkpoint.

This patch changes as below to make trigger condition being more reasonable:
- trigger sync_fs() if dirty_{nats,nodes} and prefree segs exceeds threshold;
- skip triggering sync_fs() if there is any background inflight IO or there is
foreground operation recently and meanwhile cp_rwsem is being held by someone;

Reported-by: Lei Li <noctis.akm@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2020-12-21 13:33:14 -08:00