Expand 'compress_algorithm' mount option to accept parameter as format of
<algorithm>:<level>, by this way, it gives a way to allow user to do more
specified config on lz4 and zstd compression level, then f2fs compression
can provide higher compress ratio.
In order to set compress level for lz4 algorithm, it needs to set
CONFIG_LZ4HC_COMPRESS and CONFIG_F2FS_FS_LZ4HC config to enable lz4hc
compress algorithm.
CR and performance number on lz4/lz4hc algorithm:
dd if=enwik9 of=compressed_file conv=fsync
Original blocks: 244382
lz4 lz4hc-9
compressed blocks 170647 163270
compress ratio 69.8% 66.8%
speed 16.4207 s, 60.9 MB/s 26.7299 s, 37.4 MB/s
compress ratio = after / before
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If kernel doesn't support certain kinds of compress algorithm, deny to set
them as compress algorithm of f2fs via 'compress_algorithm=%s' mount option.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Relocate f2fs_precache_extents() in prior to check_swap_activate(),
then extent cache can be enabled before its use.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch ports commit 02b016ca7f ("ext4: enforce the immutable
flag on open files") to f2fs.
According to the chattr man page, "a file with the 'i' attribute
cannot be modified..." Historically, this was only enforced when the
file was opened, per the rest of the description, "... and the file
can not be opened in write mode".
There is general agreement that we should standardize all file systems
to prevent modifications even for files that were opened at the time
the immutable flag is set. Eventually, a change to enforce this at
the VFS layer should be landing in mainline.
Cc: stable@kernel.org
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Previously, in f2fs_setattr(), we don't update S_ISUID|S_ISGID|S_ISVTX
bits with S_IRWXUGO bits and acl entries atomically, so in error path,
chmod() may partially success, this patch enhances to make chmod() flow
being atomical.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We should update the ~S_IRWXUGO part of inode->i_mode in __setattr_copy,
because posix_acl_update_mode updates mode based on inode->i_mode,
which finally overwrites the ~S_IRWXUGO part of i_acl_mode with old i_mode.
Testcase to reproduce this bug:
0. adduser abc
1. mkfs.f2fs /dev/sdd
2. mount -t f2fs /dev/sdd /mnt/f2fs
3. mkdir /mnt/f2fs/test
4. setfacl -m u🔤r /mnt/f2fs/test
5. chmod +s /mnt/f2fs/test
Signed-off-by: Weichao Guo <guoweichao@oppo.com>
Signed-off-by: Bin Shu <shubin@oppo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
If we have large section/zone, unallocated segment makes them corrupted.
E.g.,
- Pinned file: -1 119304647 119304647
- ATGC data: -1 119304647 119304647
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Although it isn't used directly by the ioctls,
"struct fsverity_descriptor" is required by userspace programs that need
to compute fs-verity file digests in a standalone way. Therefore
it's also needed to sign files in a standalone way.
Similarly, "struct fsverity_formatted_digest" (previously called
"struct fsverity_signed_digest" which was misleading) is also needed to
sign files if the built-in signature verification is being used.
Therefore, move these structs to the UAPI header.
While doing this, try to make it clear that the signature-related fields
in fsverity_descriptor aren't used in the file digest computation.
Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
I originally chose the name "file measurement" to refer to the fs-verity
file digest to avoid confusion with traditional full-file digests or
with the bare root hash of the Merkle tree.
But the name "file measurement" hasn't caught on, and usually people are
calling it something else, usually the "file digest". E.g. see
"struct fsverity_digest" and "struct fsverity_formatted_digest", the
libfsverity_compute_digest() and libfsverity_sign_digest() functions in
libfsverity, and the "fsverity digest" command.
Having multiple names for the same thing is always confusing.
So to hopefully avoid confusion in the future, rename
"fs-verity file measurement" to "fs-verity file digest".
This leaves FS_IOC_MEASURE_VERITY as the only reference to "measure" in
the kernel, which makes some amount of sense since the ioctl is actively
"measuring" the file.
I'll be renaming this in fsverity-utils too (though similarly the
'fsverity measure' command, which is a wrapper for
FS_IOC_MEASURE_VERITY, will stay).
Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The name "struct fsverity_signed_digest" is causing confusion because it
isn't actually a signed digest, but rather it's the way that the digest
is formatted in order to be signed. Rename it to
"struct fsverity_formatted_digest" to prevent this confusion.
Also update the struct's comment to clarify that it's specific to the
built-in signature verification support and isn't a requirement for all
fs-verity users.
I'll be renaming this struct in fsverity-utils too.
Acked-by: Luca Boccassi <luca.boccassi@microsoft.com>
Link: https://lore.kernel.org/r/20201113211918.71883-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Currently it's impossible to delete files that use an unsupported
encryption policy, as the kernel will just return an error when
performing any operation on the top-level encrypted directory, even just
a path lookup into the directory or opening the directory for readdir.
More specifically, this occurs in any of the following cases:
- The encryption context has an unrecognized version number. Current
kernels know about v1 and v2, but there could be more versions in the
future.
- The encryption context has unrecognized encryption modes
(FSCRYPT_MODE_*) or flags (FSCRYPT_POLICY_FLAG_*), an unrecognized
combination of modes, or reserved bits set.
- The encryption key has been added and the encryption modes are
recognized but aren't available in the crypto API -- for example, a
directory is encrypted with FSCRYPT_MODE_ADIANTUM but the kernel
doesn't have CONFIG_CRYPTO_ADIANTUM enabled.
It's desirable to return errors for most operations on files that use an
unsupported encryption policy, but the current behavior is too strict.
We need to allow enough to delete files, so that people can't be stuck
with undeletable files when downgrading kernel versions. That includes
allowing directories to be listed and allowing dentries to be looked up.
Fix this by modifying the key setup logic to treat an unsupported
encryption policy in the same way as "key unavailable" in the cases that
are required for a recursive delete to work: preparing for a readdir or
a dentry lookup, revalidating a dentry, or checking whether an inode has
the same encryption policy as its parent directory.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Now that fscrypt_get_encryption_info() is only called from files in
fs/crypto/ (due to all key setup now being handled by higher-level
helper functions instead of directly by filesystems), unexport it and
move its declaration to fscrypt_private.h.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-9-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The last remaining use of fscrypt_get_encryption_info() from filesystems
is for readdir (->iterate_shared()). Every other call is now in
fs/crypto/ as part of some other higher-level operation.
We need to add a new argument to fscrypt_get_encryption_info() to
indicate whether the encryption policy is allowed to be unrecognized or
not. Doing this is easier if we can work with high-level operations
rather than direct filesystem use of fscrypt_get_encryption_info().
So add a function fscrypt_prepare_readdir() which wraps the call to
fscrypt_get_encryption_info() for the readdir use case.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The call to fscrypt_get_encryption_info() in dx_show_leaf() is too low
in the call tree; fscrypt_get_encryption_info() should have already been
called when starting the directory operation. And indeed, it already
is. Moreover, the encryption key is guaranteed to already be available
because dx_show_leaf() is only called when adding a new directory entry.
And even if the key wasn't available, dx_show_leaf() uses
fscrypt_fname_disk_to_usr() which knows how to create a no-key name.
So for the above reasons, and because it would be desirable to stop
exporting fscrypt_get_encryption_info() directly to filesystems, remove
the call to fscrypt_get_encryption_info() from dx_show_leaf().
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.
Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).
Link: https://lore.kernel.org/r/20201203022041.230976-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.
Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Link: https://lore.kernel.org/r/20201203022041.230976-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Since encrypted directories can be opened and searched without their key
being available, and each readdir and ->lookup() tries to set up the
key, trying to set up the key in ->open() too isn't really useful.
Just remove it so that directories don't need an ->open() method
anymore, and so that we eliminate a use of fscrypt_get_encryption_info()
(which I'd like to stop exporting to filesystems).
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20201203022041.230976-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
The stated reasons for separating fscrypt_master_key::mk_secret_sem from
the standard semaphore contained in every 'struct key' no longer apply.
First, due to commit a992b20cd4 ("fscrypt: add
fscrypt_prepare_new_inode() and fscrypt_set_context()"),
fscrypt_get_encryption_info() is no longer called from within a
filesystem transaction.
Second, due to commit d3ec10aa95 ("KEYS: Don't write out to userspace
while holding key semaphore"), the semaphore for the "keyring" key type
no longer ranks above page faults.
That leaves performance as the only possible reason to keep the separate
mk_secret_sem. Specifically, having mk_secret_sem reduces the
contention between setup_file_encryption_key() and
FS_IOC_{ADD,REMOVE}_ENCRYPTION_KEY. However, these ioctls aren't
executed often, so this doesn't seem to be worth the extra complexity.
Therefore, simplify the locking design by just using key->sem instead of
mk_secret_sem.
Link: https://lore.kernel.org/r/20201117032626.320275-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
In an encrypted directory, a regular dentry (one that doesn't have the
no-key name flag) can only be created if the directory's encryption key
is available.
Therefore the calls to fscrypt_require_key() in __fscrypt_prepare_link()
and __fscrypt_prepare_rename() are unnecessary, as these functions
already check that the dentries they're given aren't no-key names.
Remove these unnecessary calls to fscrypt_require_key().
Link: https://lore.kernel.org/r/20201118075609.120337-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ubifs by rejecting no-key dentries in ubifs_create(),
ubifs_mkdir(), ubifs_mknod(), and ubifs_symlink().
Note that ubifs doesn't actually report the duplicate filenames from
readdir, but rather it seems to replace the original dentry with a new
one (which is still wrong, just a different effect from ext4).
On ubifs, this fixes xfstest generic/595 as well as the new xfstest I
wrote specifically for this bug.
Fixes: f4f61d2cc6 ("ubifs: Implement encrypted filenames")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on f2fs by rejecting no-key dentries in f2fs_add_link().
Note that the weird check for the current task in f2fs_do_add_link()
seems to make this bug difficult to reproduce on f2fs.
Fixes: 9ea97163c6 ("f2fs crypto: add filename encryption for f2fs_add_link")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
As described in "fscrypt: add fscrypt_is_nokey_name()", it's possible to
create a duplicate filename in an encrypted directory by creating a file
concurrently with adding the directory's encryption key.
Fix this bug on ext4 by rejecting no-key dentries in ext4_add_entry().
Note that the duplicate check in ext4_find_dest_de() sometimes prevented
this bug. However in many cases it didn't, since ext4_find_dest_de()
doesn't examine every dentry.
Fixes: 4461471107 ("ext4 crypto: enable filename encryption")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.
Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative. Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable. However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.
If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.
In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)
In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check. Use this helper function for
the existing checks that fs/crypto/ does for rename and link.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
There isn't really any valid reason to use __FSCRYPT_MODE_MAX or
FSCRYPT_POLICY_FLAGS_VALID in a userspace program. These constants are
only meant to be used by the kernel internally, and they are defined in
the UAPI header next to the mode numbers and flags only so that kernel
developers don't forget to update them when adding new modes or flags.
In https://lkml.kernel.org/r/20201005074133.1958633-2-satyat@google.com
there was an example of someone wanting to use __FSCRYPT_MODE_MAX in a
user program, and it was wrong because the program would have broken if
__FSCRYPT_MODE_MAX were ever increased. So having this definition
available is harmful. FSCRYPT_POLICY_FLAGS_VALID has the same problem.
So, remove these definitions from the UAPI header. Replace
FSCRYPT_POLICY_FLAGS_VALID with just listing the valid flags explicitly
in the one kernel function that needs it. Move __FSCRYPT_MODE_MAX to
fscrypt_private.h, remove the double underscores (which were only
present to discourage use by userspace), and add a BUILD_BUG_ON() and
comments to (hopefully) ensure it is kept in sync.
Keep the old name FS_POLICY_FLAGS_VALID, since it's been around for
longer and there's a greater chance that removing it would break source
compatibility with some program. Indeed, mtd-utils is using it in
an #ifdef, and removing it would introduce compiler warnings (about
FS_POLICY_FLAGS_PAD_* being redefined) into the mtd-utils build.
However, reduce its value to 0x07 so that it only includes the flags
with old names (the ones present before Linux 5.4), and try to make it
clear that it's now "frozen" and no new flags should be added to it.
Fixes: 2336d0deb2 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
Cc: <stable@vger.kernel.org> # v5.4+
Link: https://lore.kernel.org/r/20201024005132.495952-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
This patch addresses minor issues in compression chksum.
Fixes: b28f047b28c5 ("f2fs: compress: support chksum")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
During recovery, we may missed to update inline xattr count correctly,
fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Many flash devices read and write a single IO based on a multiple
of 4KB, and we support only 4KB page cache size now.
Since we already check page size in init_f2fs_fs(), so remove page
size check in sanity_check_raw_super().
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Shaohua Liu <liush@allwinnertech.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch adds max_io_bytes to limit bio size when f2fs tries to merge
consecutive IOs. This can give a testing point to split out bios and check
end_io handles those bios correctly. This is used to capture a recent bug
on the decompression and fsverity flow.
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Added two ioctl to decompress/compress explicitly the compression
enabled file in "compress_mode=user" mount option.
Using these two ioctls, the users can make a control of compression
and decompression of their files.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We will add a new "compress_mode" mount option to control file
compression mode. This supports "fs" and "user". In "fs" mode (default),
f2fs does automatic compression on the compression enabled files.
In "user" mode, f2fs disables the automaic compression and gives the
user discretion of choosing the target file and the timing. It means
the user can do manual compression/decompression on the compression
enabled files using ioctls.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch removes buffer_head dependency when getting block addresses.
Light reported there's a 32bit issue in f2fs_fiemap where map_bh.b_size is
32bits while len is 64bits given by user. This will give wrong length to
f2fs_map_block.
Reported-by: Light Hsieh <Light.Hsieh@mediatek.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We should convert cur_lblock, a block count, to bytes for len.
Fixes: af4b6b8edf ("f2fs: introduce check_swap_activate_fast()")
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch renames two functions like below having u64.
- logical_to_blk to bytes_to_blks
- blk_to_logical to blks_to_bytes
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
For multi-device case, one f2fs image includes multi devices, so it
needs to account bytes written of all block devices belong to the image
rather than one main block device, fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch supports to store chksum value with compressed
data, and verify the integrality of compressed data while
reading the data.
The feature can be enabled through specifying mount option
'compress_chksum'.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Lei Li reported a issue: if foreground operations are frequent, background
checkpoint may be always skipped due to below check, result in losing more
data after sudden power-cut.
f2fs_balance_fs_bg()
...
if (!is_idle(sbi, REQ_TIME) &&
(!excess_dirty_nats(sbi) && !excess_dirty_nodes(sbi)))
return;
E.g:
cp_interval = 5 second
idle_interval = 2 second
foreground operation interval = 1 second (append 1 byte per second into file)
In such case, no matter when it calls f2fs_balance_fs_bg(), is_idle(, REQ_TIME)
returns false, result in skipping background checkpoint.
This patch changes as below to make trigger condition being more reasonable:
- trigger sync_fs() if dirty_{nats,nodes} and prefree segs exceeds threshold;
- skip triggering sync_fs() if there is any background inflight IO or there is
foreground operation recently and meanwhile cp_rwsem is being held by someone;
Reported-by: Lei Li <noctis.akm@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>