Immediate packets should only be sent to peer when there are new
receive credits made available. New credits show up on freeing
receive buffer, not on receiving data.
Fix this by avoid unnenecessary work schedules.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
When processing errors from ib_post_send(), the transport state needs to be
rolled back to the condition before the error.
Refactor the old code to make it easy to roll back on IB errors, and fix this.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CIFS uses pre-allocated crypto structures to calculate signatures for both
incoming and outgoing packets. In this way it doesn't need to allocate crypto
structures for every packet, but it requires a lock to prevent concurrent
access to crypto structures.
Remove the lock by allocating crypto structures on the fly for
incoming packets. At the same time, we can still use pre-allocated crypto
structures for outgoing packets, as they are already protected by transport
lock srv_mutex.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Recevie credits should be updated before sending the packet, not
before a work is scheduled. Also, the value needs roll back if
something fails and cannot send.
Signed-off-by: Long Li <longli@microsoft.com>
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Sometimes the remote peer may return more send credits than the send queue
depth. If all the send credits are used to post senasd, we may overflow the
send queue.
Fix this by checking the send queue size before posting a send.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
As an optimization, SMBD tries to track two types of packets: packets with
payload and without payload. There is no obvious benefit or performance gain
to separately track two types of packets.
Just treat them as pending packets and merge the tracking code.
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Fix tcon use-after-free and NULL ptr deref.
Customer system crashes with the following kernel log:
[462233.169868] CIFS VFS: Cancelling wait for mid 4894753 cmd: 14 => a QUERY DIR
[462233.228045] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.305922] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.306205] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347060] CIFS VFS: cifs_put_smb_ses: Session Logoff failure rc=-4
[462233.347107] CIFS VFS: Close unmatched open
[462233.347113] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
...
[exception RIP: cifs_put_tcon+0xa0] (this is doing tcon->ses->server)
#6 [...] smb2_cancelled_close_fid at ... [cifs]
#7 [...] process_one_work at ...
#8 [...] worker_thread at ...
#9 [...] kthread at ...
The most likely explanation we have is:
* When we put the last reference of a tcon (refcount=0), we close the
cached share root handle.
* If closing a handle is interrupted, SMB2_close() will
queue a SMB2_close() in a work thread.
* The queued object keeps a tcon ref so we bump the tcon
refcount, jumping from 0 to 1.
* We reach the end of cifs_put_tcon(), we free the tcon object despite
it now having a refcount of 1.
* The queued work now runs, but the tcon, ses & server was freed in
the meantime resulting in a crash.
THREAD 1
========
cifs_put_tcon => tcon refcount reach 0
SMB2_tdis
close_shroot_lease
close_shroot_lease_locked => if cached root has lease && refcount = 0
smb2_close_cached_fid => if cached root valid
SMB2_close => retry close in a thread if interrupted
smb2_handle_cancelled_close
__smb2_handle_cancelled_close => !! tcon refcount bump 0 => 1 !!
INIT_WORK(&cancelled->work, smb2_cancelled_close_fid);
queue_work(cifsiod_wq, &cancelled->work) => queue work
tconInfoFree(tcon); ==> freed!
cifs_put_smb_ses(ses); ==> freed!
THREAD 2 (workqueue)
========
smb2_cancelled_close_fid
SMB2_close(0, cancelled->tcon, ...); => use-after-free of tcon
cifs_put_tcon(cancelled->tcon); => tcon refcount reach 0 second time
*CRASH*
Fixes: d919131935 ("CIFS: Close cached root handle only if it has a lease")
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
An earlier commit "io_uring: remove @nxt from handlers" removed the
setting of pointer nxt and now it is always null, hence the non-null
check and call to io_wq_assign_next is redundant and can be removed.
Addresses-Coverity: ("'Constant' variable guard")
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of the various open coded calls to set the NFS_INO_STALE bit
and call nfs_zap_caches(), consolidate them into a single function
nfs_set_inode_stale().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Pull fsnotify updates from Jan Kara:
"This implements the fanotify FAN_DIR_MODIFY event.
This event reports the name in a directory under which a change
happened and together with the directory filehandle and fstatat()
allows reliable and efficient implementation of directory
synchronization"
* tag 'fsnotify_for_v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: Fix the checks in fanotify_fsid_equal
fanotify: report name info for FAN_DIR_MODIFY event
fanotify: record name info for FAN_DIR_MODIFY event
fanotify: Drop fanotify_event_has_fid()
fanotify: prepare to report both parent and child fid's
fanotify: send FAN_DIR_MODIFY event flavor with dir inode and name
fanotify: divorce fanotify_path_event and fanotify_fid_event
fanotify: Store fanotify handles differently
fanotify: Simplify create_fd()
fanotify: fix merging marks masks with FAN_ONDIR
fanotify: merge duplicate events on parent and child
fsnotify: replace inode pointer with an object id
fsnotify: simplify arguments passing to fsnotify_parent()
fsnotify: use helpers to access data by data_type
fsnotify: funnel all dirent events through fsnotify_name()
fsnotify: factor helpers fsnotify_dentry() and fsnotify_file()
fsnotify: tidy up FS_ and FAN_ constants
Pull ext2/udf updates from Jan Kara:
"Cleanups and fixes for ext2 and one cleanup for udf"
* tag 'for_v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: fix empty body warnings when -Wextra is used
ext2: fix debug reference to ext2_xattr_cache
udf: udf_sb.h: Replace zero-length array with flexible-array member
ext2: xattr.h: Replace zero-length array with flexible-array member
ext2: Silence lockdep warning about reclaim under xattr_sem
Pull 9p updates from Dominique Martinet:
"Not much new, but a few patches for this cycle:
- Fix read with O_NONBLOCK to allow incomplete read and return
immediately
- Rest is just cleanup (indent, unused field in struct, extra
semicolon)"
* tag '9p-for-5.7' of git://github.com/martinetd/linux:
net/9p: remove unused p9_req_t aux field
9p: read only once on O_NONBLOCK
9pnet: allow making incomplete read requests
9p: Remove unneeded semicolon
9p: Fix Kconfig indentation
Reflink should force the log out to disk if the filesystem was mounted
with wsync, the same as most other operations in xfs.
[Note: XFS_MOUNT_WSYNC is set when the admin mounts the filesystem
with either the 'wsync' or 'sync' mount options, which effectively means
that we're classifying reflink/dedupe as IO operations and making them
synchronous when required.]
Fixes: 3fc9f5e409 ("xfs: remove xfs_reflink_remap_range")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
[darrick: add more to the changelog]
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Pull vfs pathwalk fix from Al Viro:
"Dumb braino in legitimize_path()..."
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix a braino in legitimize_path()
brown paperbag time... wrong order of arguments ended up confusing
the values to check dentry and mount_lock seqcounts against.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Fixes: 2aa3847085 ("non-RCU analogue of the previous commit")
Tested-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If io_get_req() fails, it drops a ref. Then, awhile keeping @submitted
unmodified, io_submit_sqes() breaks the loop and puts @nr - @submitted
refs. For each submitted req a ref is dropped in io_put_req() and
friends. So, for @nr taken refs there will be
(@nr - @submitted + @submitted + 1) dropped.
Remove ctx refcounting from io_get_req(), that at the same time makes
it clearer.
Fixes: 2b85edfc0c ("io_uring: batch getting pcpu references")
Cc: stable@vger.kernel.org # v5.6
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit 9255782f70 ("sysfs: Wrap __compat_only_sysfs_link_entry_to_kobj
function to change the symlink name") made this function a wrapper
around a new non-underscored function, which is a bit odd. The normal
naming convention is the other way around: the underscored function is
the wrappee, and the non-underscored function is the wrapper.
There's only one single user (well, two call-sites in that user) of the
more limited double underscore version of this function, so just remove
the oddly named wrapper entirely and just add the extra NULL argument to
the user.
I considered just doing that in the merge, but that tends to make
history really hard to read.
Link: https://lore.kernel.org/lkml/CAHk-=wgkkmNV5tMzQDmPAQuNJBuMcry--Jb+h8H1o4RA3kF7QQ@mail.gmail.com/
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull powerpc updates from Michael Ellerman:
"Slightly late as I had to rebase mid-week to insert a bug fix:
- A large series from Nick for 64-bit to further rework our exception
vectors, and rewrite portions of the syscall entry/exit and
interrupt return in C. The result is much easier to follow code
that is also faster in general.
- Cleanup of our ptrace code to split various parts out that had
become badly intertwined with #ifdefs over the years.
- Changes to our NUMA setup under the PowerVM hypervisor which should
hopefully avoid non-sensical topologies which can lead to warnings
from the workqueue code and other problems.
- MAINTAINERS updates to remove some of our old orphan entries and
update the status of others.
- Quite a few other small changes and fixes all over the map.
Thanks to: Abdul Haleem, afzal mohammed, Alexey Kardashevskiy, Andrew
Donnellan, Aneesh Kumar K.V, Balamuruhan S, Cédric Le Goater, Chen
Zhou, Christophe JAILLET, Christophe Leroy, Christoph Hellwig, Clement
Courbet, Daniel Axtens, David Gibson, Douglas Miller, Fabiano Rosas,
Fangrui Song, Ganesh Goudar, Gautham R. Shenoy, Greg Kroah-Hartman,
Greg Kurz, Gustavo Luiz Duarte, Hari Bathini, Ilie Halip, Jan Kara,
Joe Lawrence, Joe Perches, Kajol Jain, Larry Finger, Laurentiu Tudor,
Leonardo Bras, Libor Pechacek, Madhavan Srinivasan, Mahesh Salgaonkar,
Masahiro Yamada, Masami Hiramatsu, Mauricio Faria de Oliveira, Michael
Neuling, Michal Suchanek, Mike Rapoport, Nageswara R Sastry, Nathan
Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick
Desaulniers, Oliver O'Halloran, Po-Hsu Lin, Pratik Rajesh Sampat,
Rasmus Villemoes, Ravi Bangoria, Roman Bolshakov, Sam Bobroff,
Sandipan Das, Santosh S, Sedat Dilek, Segher Boessenkool, Shilpasri G
Bhat, Sourabh Jain, Srikar Dronamraju, Stephen Rothwell, Tyrel
Datwyler, Vaibhav Jain, YueHaibing"
* tag 'powerpc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (158 commits)
powerpc: Make setjmp/longjmp signature standard
powerpc/cputable: Remove unnecessary copy of cpu_spec->oprofile_type
powerpc: Suppress .eh_frame generation
powerpc: Drop -fno-dwarf2-cfi-asm
powerpc/32: drop unused ISA_DMA_THRESHOLD
powerpc/powernv: Add documentation for the opal sensor_groups sysfs interfaces
selftests/powerpc: Fix try-run when source tree is not writable
powerpc/vmlinux.lds: Explicitly retain .gnu.hash
powerpc/ptrace: move ptrace_triggered() into hw_breakpoint.c
powerpc/ptrace: create ppc_gethwdinfo()
powerpc/ptrace: create ptrace_get_debugreg()
powerpc/ptrace: split out ADV_DEBUG_REGS related functions.
powerpc/ptrace: move register viewing functions out of ptrace.c
powerpc/ptrace: split out TRANSACTIONAL_MEM related functions.
powerpc/ptrace: split out SPE related functions.
powerpc/ptrace: split out ALTIVEC related functions.
powerpc/ptrace: split out VSX related functions.
powerpc/ptrace: drop PARAMETER_SAVE_AREA_OFFSET
powerpc/ptrace: drop unnecessary #ifdefs CONFIG_PPC64
powerpc/ptrace: remove unused header includes
...
Pull ext4 updates from Ted Ts'o:
- Replace ext4's bmap and iopoll implementations to use iomap.
- Clean up extent tree handling.
- Other cleanups and miscellaneous bug fixes
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (31 commits)
ext4: save all error info in save_error_info() and drop ext4_set_errno()
ext4: fix incorrect group count in ext4_fill_super error message
ext4: fix incorrect inodes per group in error message
ext4: don't set dioread_nolock by default for blocksize < pagesize
ext4: disable dioread_nolock whenever delayed allocation is disabled
ext4: do not commit super on read-only bdev
ext4: avoid ENOSPC when avoiding to reuse recently deleted inodes
ext4: unregister sysfs path before destroying jbd2 journal
ext4: check for non-zero journal inum in ext4_calculate_overhead
ext4: remove map_from_cluster from ext4_ext_map_blocks
ext4: clean up ext4_ext_insert_extent() call in ext4_ext_map_blocks()
ext4: mark block bitmap corrupted when found instead of BUGON
ext4: use flexible-array member for xattr structs
ext4: use flexible-array member in struct fname
Documentation: correct the description of FIEMAP_EXTENT_LAST
ext4: move ext4_fiemap to use iomap framework
ext4: make ext4_ind_map_blocks work with fiemap
ext4: move ext4 bmap to use iomap infrastructure
ext4: optimize ext4_ext_precache for 0 depth
ext4: add IOMAP_F_MERGED for non-extent based mapping
...
Pull nfsd updates from Chuck Lever:
- Fix EXCHANGE_ID response when NFSD runs in a container
- A battery of new static trace points
- Socket transports now use bio_vec to send Replies
- NFS/RDMA now supports filesystems with no .splice_read method
- Favor memcpy() over DMA mapping for small RPC/RDMA Replies
- Add pre-requisites for supporting multiple Write chunks
- Numerous minor fixes and clean-ups
[ Chuck is filling in for Bruce this time while he and his family settle
into a new house ]
* tag 'nfsd-5.7' of git://git.linux-nfs.org/projects/cel/cel-2.6: (39 commits)
svcrdma: Fix leak of transport addresses
SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()'
SUNRPC/cache: don't allow invalid entries to be flushed
nfsd: fsnotify on rmdir under nfsd/clients/
nfsd4: kill warnings on testing stateids with mismatched clientids
nfsd: remove read permission bit for ctl sysctl
NFSD: Fix NFS server build errors
sunrpc: Add tracing for cache events
SUNRPC/cache: Allow garbage collection of invalid cache entries
nfsd: export upcalls must not return ESTALE when mountd is down
nfsd: Add tracepoints for update of the expkey and export cache entries
nfsd: Add tracepoints for exp_find_key() and exp_get_by_name()
nfsd: Add tracing to nfsd_set_fh_dentry()
nfsd: Don't add locks to closed or closing open stateids
SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends
SUNRPC: Refactor xs_sendpages()
svcrdma: Avoid DMA mapping small RPC Replies
svcrdma: Fix double sync of transport header buffer
svcrdma: Refactor chunk list encoders
SUNRPC: Add encoders for list item discriminators
...
When we're sending a layoutreturn, ensure that we reference the
layout cred atomically with the copy of the stateid.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
When we look up the delegation cred, we are usually doing so in
conjunction with a read of the stateid, and we want to ensure
that the look up is atomic with that read.
Fixes: 57f188e047 ("NFSv4: nfs_update_inplace_delegation() should update delegation cred")
[sfr@canb.auug.org.au: Fixed up borken Fixes: line from Trond :-)]
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
A request that completes with an -EAGAIN result after it has been added
to the poll list, will not be removed from that list in io_do_iopoll()
because the f_op->iopoll() will not succeed for that request.
Maintain a retryable local list similar to the done list, and explicity
reissue requests with an -EAGAIN result.
Signed-off-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull SPDX updates from Greg KH:
"Here are three SPDX patches for 5.7-rc1.
One fixes up the SPDX tag for a single driver, while the other two go
through the tree and add SPDX tags for all of the .gitignore files as
needed.
Nothing too complex, but you will get a merge conflict with your
current tree, that should be trivial to handle (one file modified by
two things, one file deleted.)
All three of these have been in linux-next for a while, with no
reported issues other than the merge conflict"
* tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
ASoC: MT6660: make spdxcheck.py happy
.gitignore: add SPDX License Identifier
.gitignore: remove too obvious comments
We already checked this limit when the file was opened, and we keep it
open in the file table. Hence when we added unit_inflight to the count
we want to register, we're doubly accounting these files. This results
in -EMFILE for file registration, if we're at half the limit.
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull cgroup updates from Tejun Heo:
- Christian extended clone3 so that processes can be spawned into
cgroups directly.
This is not only neat in terms of semantics but also avoids grabbing
the global cgroup_threadgroup_rwsem for migration.
- Daniel added !root xattr support to cgroupfs.
Userland already uses xattrs on cgroupfs for bookkeeping. This will
allow delegated cgroups to support such usages.
- Prateek tried to make cpuset hotplug handling synchronous but that
led to possible deadlock scenarios. Reverted.
- Other minor changes including release_agent_path handling cleanup.
* 'for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs: cgroup-v1: Document the cpuset_v2_mode mount option
Revert "cpuset: Make cpuset hotplug synchronous"
cgroupfs: Support user xattrs
kernfs: Add option to enable user xattrs
kernfs: Add removed_size out param for simple_xattr_set
kernfs: kvmalloc xattr value instead of kmalloc
cgroup: Restructure release_agent_path handling
selftests/cgroup: add tests for cloning into cgroups
clone3: allow spawning processes into cgroups
cgroup: add cgroup_may_write() helper
cgroup: refactor fork helpers
cgroup: add cgroup_get_from_file() helper
cgroup: unify attach permission checking
cpuset: Make cpuset hotplug synchronous
cgroup.c: Use built-in RCU list checking
kselftest/cgroup: add cgroup destruction test
cgroup: Clean up css_set task traversal
If the original task is (or has) exited, then the task work will not get
queued properly. Allow for using the io-wq manager task to queue this
work for execution, and ensure that the io-wq manager notices and runs
this work if woken up (or exiting).
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can have a task exit if it's not the owner of the ring. Be safe and
grab an actual reference to it, to avoid a potential use-after-free.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we get woken and the poll doesn't match our mask, re-add the task
to the poll waitqueue and try again instead of completing the request
with a mask of 0.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We can keep compressed inode's data inline before inline conversion.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
It needs to call f2fs_disable_compressed_file() to disable
compression on directory.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Compression sysfs node should not be shown if f2fs module disables
compression feature.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
While checking discard timeout, we use specified type
UMOUNT_DISCARD_TIMEOUT, so just replace doplicy.timeout with
it, and switch doplicy.timeout to bool type.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In below error path, tpages[i] could be NULL, fix to check it before
releasing it.
- f2fs_read_multi_pages
- f2fs_alloc_dic
- f2fs_free_dic
Fixes: 61fbae2b2b ("f2fs: fix to avoid NULL pointer dereference")
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
fstest reports below message when compression is on:
generic/424 1s ... - output mismatch
--- tests/generic/424.out
+++ results/generic/424.out.bad
@@ -1,2 +1,26 @@
QA output created by 424
+[!] Attribute compressed should be set
+Failed
+stat_test failed
+[!] Attribute compressed should be set
+Failed
+stat_test failed
We missed to set STATX_ATTR_COMPRESSED on compressed inode in getattr(),
fix it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Add zstd compress algorithm support, use "compress_algorithm=zstd"
mountoption to enable it.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Setting nfs_mountpoint_expiry_timeout() to a negative value stops
mountpoint expiration, while setting it to a positive value restarts
the scheduler.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
We must not return from nfs_d_automount() without holding 2 references
to the mount record. Doing so, will trigger the BUG() in finish_automount().
Also ensure that we don't try to reschedule the automount timer with
a negative or zero timeout value.
Fixes: 22a1ae9a93 ("NFS: If nfs_mountpoint_expiry_timeout < 0, do not expire submounts")
Cc: stable@vger.kernel.org # v5.5+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
nfs_vers_tokens, nfs_xprt_protocol_tokens, and nfs_secflavor_tokens were
all missing an empty item at the end of the array, allowing
lookup_constant() to potentially walk off the end and trigger and oops.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Fixes: e38bb238ed ("NFS: Convert mount option parsing to use functionality from fs_parser.h")
Cc: stable@vger.kernel.org # v5.6
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Merge updates from Andrew Morton:
"A large amount of MM, plenty more to come.
Subsystems affected by this patch series:
- tools
- kthread
- kbuild
- scripts
- ocfs2
- vfs
- mm: slub, kmemleak, pagecache, gup, swap, memcg, pagemap, mremap,
sparsemem, kasan, pagealloc, vmscan, compaction, mempolicy,
hugetlbfs, hugetlb"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
include/linux/huge_mm.h: check PageTail in hpage_nr_pages even when !THP
mm/hugetlb: fix build failure with HUGETLB_PAGE but not HUGEBTLBFS
selftests/vm: fix map_hugetlb length used for testing read and write
mm/hugetlb: remove unnecessary memory fetch in PageHeadHuge()
mm/hugetlb.c: clean code by removing unnecessary initialization
hugetlb_cgroup: add hugetlb_cgroup reservation docs
hugetlb_cgroup: add hugetlb_cgroup reservation tests
hugetlb: support file_region coalescing again
hugetlb_cgroup: support noreserve mappings
hugetlb_cgroup: add accounting for shared mappings
hugetlb: disable region_add file_region coalescing
hugetlb_cgroup: add reservation accounting for private mappings
mm/hugetlb_cgroup: fix hugetlb_cgroup migration
hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations
hugetlb_cgroup: add hugetlb_cgroup reservation counter
hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race
hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
mm/memblock.c: remove redundant assignment to variable max_addr
mm: mempolicy: require at least one nodeid for MPOL_PREFERRED
mm: mempolicy: use VM_BUG_ON_VMA in queue_pages_test_walk()
...
Pull xfs updates from Darrick Wong:
"There's a lot going on this cycle with cleanups in the log code, the
btree code, and the xattr code.
We're tightening of metadata validation and online fsck checking, and
introducing a common btree rebuilding library so that we can refactor
xfs_repair and introduce online repair in a future cycle.
We also fixed a few visible bugs -- most notably there's one in
getdents that we introduced in 5.6; and a fix for hangs when disabling
quotas.
This series has been running fstests & other QA in the background for
over a week and looks good so far.
I anticipate sending a second pull request next week. That batch will
change how xfs interacts with memory reclaim; how the log batches and
throttles log items; how hard writes near ENOSPC will try to squeeze
more space out of the filesystem; and hopefully fix the last of the
umount hangs after a catastrophic failure. That should ease a lot of
problems when running at the limits, but for now I'm leaving that in
for-next for another week to make sure we got all the subtleties
right.
Summary:
- Fix a hard to trigger race between iclog error checking and log
shutdown.
- Strengthen the AGF verifier.
- Ratelimit some of the more spammy error messages.
- Remove the icdinode uid/gid members and just use the ones in the
vfs inode.
- Hold ILOCK across insert/collapse range.
- Clean up the extended attribute interfaces.
- Clean up the attr flags mess.
- Restore PF_MEMALLOC after exiting xfsaild thread to avoid
triggering warnings in the process accounting code.
- Remove the flexibly-sized array from struct xfs_agfl to eliminate
compiler warnings about unaligned pointers and packed structures.
- Various macro and typedef removals.
- Stale metadata buffers if we decide they're corrupt outside of a
verifier.
- Check directory data/block/free block owners.
- Fix a UAF when aborting inactivation of a corrupt xattr fork.
- Teach online scrub to report failed directory and attr name lookups
as a metadata corruption instead of a runtime error.
- Avoid potential buffer overflows in sysfs files by using scnprintf.
- Fix a regression in getdents lookups due to a mistake in pointer
arithmetic.
- Refactor btree cursor private data structures to use anonymous
unions.
- Cleanups in the log unmounting code.
- Fix a potential mishandling of ENOMEM errors on multi-block
directory buffer lookups.
- Fix an incorrect test in the block allocation code.
- Cleanups and name prefix shortening in the scrub code.
- Introduce btree bulk loading code for online repair and scrub.
- Fix a quotaoff log item leak (and hang) when the fs goes down
midway through a quotaoff operation.
- Remove di_version from the incore inode.
- Refactor some of the log shutdown checking code.
- Record the forcing of the log unmount records in the log force
counters.
- Fix a longstanding bug where quotacheck would purge the
administrator's default quota grace interval and warning limits.
- Reduce memory usage when scrubbing directory and xattr trees.
- Don't let fsfreeze race with GETFSMAP or online scrub.
- Handle bio_add_page failures more gracefully in xlog_write_iclog"
* tag 'xfs-5.7-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (108 commits)
xfs: prohibit fs freezing when using empty transactions
xfs: shutdown on failure to add page to log bio
xfs: directory bestfree check should release buffers
xfs: drop all altpath buffers at the end of the sibling check
xfs: preserve default grace interval during quotacheck
xfs: remove xlog_state_want_sync
xfs: move the ioerror check out of xlog_state_clean_iclog
xfs: refactor xlog_state_clean_iclog
xfs: remove the aborted parameter to xlog_state_done_syncing
xfs: simplify log shutdown checking in xfs_log_release_iclog
xfs: simplify the xfs_log_release_iclog calling convention
xfs: factor out a xlog_wait_on_iclog helper
xfs: merge xlog_cil_push into xlog_cil_push_work
xfs: remove the di_version field from struct icdinode
xfs: simplify a check in xfs_ioctl_setattr_check_cowextsize
xfs: simplify di_flags2 inheritance in xfs_ialloc
xfs: only check the superblock version for dinode size calculation
xfs: add a new xfs_sb_version_has_v3inode helper
xfs: fix unmount hang and memory leak on shutdown during quotaoff
xfs: factor out quotaoff intent AIL removal and memory free
...
Pull hibernation fix from Darrick Wong:
"Fix a regression where we broke the userspace hibernation driver by
disallowing writes to the swap device"
* tag 'vfs-5.7-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
hibernate: Allow uswsusp to write to swap