Cleanup incremental-fs left overs on umount, otherwise incr-fs will
complain as below:
BUG: Dentry {i=47a,n=.incomplete} still in use [unmount of incremental-fs]
This requires vfs_rmdir() of the special index and incomplete dirs.
Also free options.sysfs_name in incfs_mount_fs() instead of in
incfs_free_mount_info() to make it consistent with incfs_remount_fs().
Since set_anon_super() was used in incfs_mount_fs() the incfs_kill_sb()
should use kill_anon_super() instead of generic_shutdown_super()
otherwise it will leak the pseudo dev_t that set_anon_super() allocates.
Bug: 211066171
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I7ea54db63513fc130e1997cbf79121015ee12405
Syzbot recently found a number of issues related to incremental-fs
(see bug numbers below). All have to do with the fact that incr-fs
allows mounts of the same source and target multiple times.
The correct behavior for a file system is to allow only one such
mount, and then every subsequent attempt should fail with a -EBUSY
error code. In case of the issues listed below the common pattern
is that the reproducer calls:
mount("./file0", "./file0", "incremental-fs", 0, NULL)
many times and then invokes a file operation like chmod, setxattr,
or open on the ./file0. This causes a recursive call for all the
mounted instances, which eventually causes a stack overflow and
a kernel crash:
BUG: stack guard page was hit at ffffc90000c0fff8
kernel stack overflow (double-fault): 0000 [#1] PREEMPT SMP KASAN
The reason why many mounts with the same source and target are
possible is because the incfs_mount_fs() as it is allocates a new
super_block for every call, regardless of whether a given mount already
exists or not. This happens every time the sget() function is called
with a test param equal to NULL.
The correct behavior for an FS mount implementation is to call
appropriate mount vfs call for it's type, i.e. mount_bdev() for
a block device backed FS, mount_single() for a pseudo file system,
like sysfs that is mounted in a single, well know location, or
mount_nodev() for other special purpose FS like overlayfs.
In case of incremental-fs the open coded mount logic doesn't check
for abusive mount attempts such as overlays.
To fix this issue the logic needs to be changed to pass a proper
test function to sget() call, which then checks if a super_block
for a mount instance has already been allocated and also allows
the VFS to properly verify invalid mount attempts.
Bug: 211066171
Bug: 213140206
Bug: 213215835
Bug: 211914587
Bug: 211213635
Bug: 213137376
Bug: 211161296
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Change-Id: I66cfc3f1b5aaffb32b0845b2dad3ff26fe952e27
Adding seven sysfs entries per mount:
reads_failed_timed_out
reads_failed_hash_verification
reads_failed_other
reads_delayed_pending
reads_delayed_pending_us
reads_delayed_min
reads_delayed_min_us
to allow for status monitoring from userland
Change-Id: I50677511c2af4778ba0c574bb80323f31425b4d0
Test: incfs_test passes
Bug: 160634343
Bug: 184291759
Signed-off-by: Paul Lawrence <paullawrence@google.com>
For incfs files that were created without a merkle tree, enabling verity
requires building a merkle tree first. Although this is the same logic
as verity performs, it is not that easy to reconcile the two given that
incfs has the merkle tree potentially when verity is not enabled.
Bug: 160634504
Test: incfs_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ia15a4051fa3362820846d65859e3af76b77f8cc4
Now fsverity state is preserved across inode eviction.
Added incfs.verity xattr to track when a file is fs-verity enabled.
Bug: 160634504
Test: incfs_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I41d90abd55527884d9eff642c9834ad837ff6918
Add FS_IOC_ENABLE_VERITY ioctl
When called, calculate measurement, validate signature against fsverity,
and set S_VERITY flag.
This does not (yet) preserve the verity status once the inode is
evicted.
Bug: 160634504
Test: incfs_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I88af2721f650098accc72a64528c7d85b753c7f6
Bug: 177075428
Test: incfs_test passes
atest GtsIncrementalInstallTestCases has only 8 failures
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I73accfc1982aec1cd7947996c25a23e4a97cfdac
.blocks_writen file handling was missing some operations:
SELinux xattr handlers, safety checks for it being a
pseudo file etc.
This CL generalizes pseudo file handling so that all such
files work in a generic way and next time it should be
easier to add all operations at once.
Bug: 175823975
Test: incfs_tests pass
Change-Id: Id2b1936018c81c62c8ab4cdbaa8827e2679b513f
Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Bug: 174692664
Test: incfs_test passes, incremental installs work with ag/13082306
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ib1c924bbaff759f58f7d83bad8e23d7224ba7ed9
Rmove bc_mutex used to protect metadata chain, now that is only
read at file open time
Remove certain unused mount options
Bug: 172482559
Test: incfs_test passes
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Id70e5a5d08e5de79f391e19ea97e356f39a3ed51
Use Read-Write locks for reading/writing segment in blockmap.
This should allow parallel reads when there are
multiple reads within same segment.
A small optimization in pending_reads_read(). Since
incfs_collect_pending_reads() already iterate to
populate buffer, new_max_sn - highest serial number
among all the pending read buffer can be done in the same
loop instead of looping again in pending_reads_read().
Bug: 161566104
Test: kernel selftest - incfs_test and incfs_perf
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Id00376b0e4cb8c0c0bc8264cdddd6f38c4aa85f0
1: Invoke kunmap(page) in error path
2: Validate NULL checks at few places in the code.
3: path_put() should not be invoked if path entry is null.
Although path_put() checks for NULL condition internally,
caller should gracefully handle it.
Bug: 161565969
Test: kernel selftest - incfs_test, incfs_perf
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Ie4dfaaba4b09f4798d492f8a25dd9dcc8da89e51
Use RCU locks instead of pending_reads_mutex.
Current mutex is taking lock on entire mount_info
structure which seems a heavy operation.
Following fields of mount_info structure
are protected through spinlocks for multiple
writers and are RCU safe for readers:
- reads_list_head
- mi_pending_reads_count
- mi_last_pending_read_number
- data_file_segment.reads_list_head
We could probably use atomic_inc/atomic_dec for
mi_pending_reads_count and mi_last_pending_read_number
which can futher cut down spin_locks at couple of more places,
thereby only the list addition and removal can protected
by spinlock. This CL doesn't address it.
Bug: 161565969
Test: kernel selftest incfs_test and incfs_perf
Signed-off-by: Akilesh Kailash <akailash@google.com>
Change-Id: Iad7439657016764dce25d64c8b3df69b930452bc
This reverts commit ab185e45f6.
This change used the PageChecked flag to mark the Merkle tree as
checked. However, f2fs uses this internally. This caused file system
hangs on devices after installs.
Test: incfs_test passes, installs no longer hang
Bug: 157589629
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I980a700d65eb4f4a77434715d61dda4b8e80658c
Waking up the waiters accounts for 80+% of the total logging
time, and about 40% of overall read_single_page() with no
signature verification. By throttling it to once every 16ms
we get back all read performance, reduce the waiter's CPU
usage and still leave it enough time to pull the logs out.
Bug: 155996534
Test: adb install megacity.apk & dd from the installed apk
Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I4a118dc226d7ca318cf099ba3e239f0120bb23c2
With a verified file (use incfs_perf to create a verified file), throughput
measured using dd after dropping caches increases from 200M/s to 290M/s
Test: incfs_test passes
Bug: 155996534
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I7abb5ad92e4167f82f3452acc9db322fec8307dd
Bug: 153560805
Test: incfs_test passes on qemu and Pixel 4
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I1b55341e4e4247a74f3f539b9d190fef0ca409b8
Read log buffer can have multiple threads doing any of these
operations simultaneously:
- Polling for changes
- Reading log records
- Adding new log records
- Updating log buffer size, or enabling/disabling it completely
As we don't control the userspace, and it turns out that they
all currently originate from different processes, code needs to
be safe against parallel access to a read buffer and a request
for reallocating it.
This CL add an r/w spinlock to protect the buffer and its size.
Each remount takes the write lock, while everything else takes
a read lock. Remount makes sure it doesn't take too long by
preallocating and precalculating all updates, while other
operations don't care much about their critical section size -
they all can still run together.
Bug: 152633648
Test: manual remount + reading
Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I7271b4cb89f1ae2cbee6e5b073758f344c4ba66a
Found by sparse
Bug: 153174547
Test: make C=2 fs/incfs/incrementalfs.ko no errors, incfs_test pass
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I9ff4f4f35975fe09936724488b96cd8bdeeb719e
This led to a 20x speed improvement on QEMU. 512 is somewhat
arbitrary - most of the gains are already there reading 64 records
at a time, but since the record size is 10 bytes, 512 is just over
a page and seems a good choice.
Bug: 153170997
Test: incfs_test passes. Adding logging to incfs_get_filled_blocks
to measure performance shows a 20x improvement
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: Ifb2da77cfd8c9d653c7047ba1eb7f39d795fa1c2
Since INCFS_IOC_GET_FILLED_BLOCKS potentially leaks information about usage
patterns, and is only useful to someone filling the file, best protect it in
the same way as INCFS_IOC_FILL_BLOCKS.
Add useful field data_block_out as well
Test: incfs_test passes
Bug: 152983639
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Change-Id: I126a8cf711e56592479093e9aadbfd0e7f700752
When read log is 0 sized, we still need to init the wait queue to avoid
kernel panics if someone does decide to poll on the read log.
Test: Added test for this condition, incfs_test crashes
With fix, incfs_test doesn't crash
Signed-off-by: Paul Lawrence <paullawrence@google.com>
Bug: 152909243
Change-Id: Ic3250523bb7ddb1839f8e95852c17103e5ffb782
When returning incomplete results index_out has to be usable to
call the function again and resume from the same location. This
means that if the output buffer was too small the function needs
to check for that when encountering the _beginning_ of a next
output range, not the end of it. Otherwise resuming from the
end of the range that didn't fit into the buffer would cause
the call to never return that range
+ Make the backing file header flags update thread safe
Bug: 152691988
Test: libincfs-test, incfs_test passes
Signed-off-by: Yurii Zubrytskyi <zyy@google.com>
Change-Id: I351156beba0b74e1942a39117279d3fcdb5e0c78
Signed-off-by: Paul Lawrence <paullawrence@google.com>