xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Darrick J. Wong	d0018ad889	xfs: inode scrubber shouldn't bother with raw checks The inode scrubber tries to _iget the inode prior to running checks. If that _iget call fails with corruption errors that's an automatic fail, regardless of whether it was the inode buffer read verifier, the ifork verifier, or the ifork formatter that errored out. Therefore, get rid of the raw mode scrub code because it's not needed. Found by trying to fix some test failures in xfs/379 and xfs/415. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:08 -07:00
Darrick J. Wong	5e777b62b0	xfs: bmap scrubber should do rmap xref with bmap for sparse files When we're scanning an extent mapping inode fork, ensure that every rmap record for this ifork has a corresponding bmbt record too. This (mostly) provides the ability to cross-reference rmap records with bmap data. The rmap scrubber cannot do the xref on its own because that requires taking an ilock with the agf lock held, which violates our locking order rules (inode, then agf). Note that we only do this for forks that are in btree format due to the increased complexity; or forks that should have data but suspiciously have zero extents because the inode could have just had its iforks zapped by the inode repair code and now we need to reclaim the old extents. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:07 -07:00
Darrick J. Wong	6edb181053	xfs: refactor inode buffer verifier error logging When the inode buffer verifier encounters an error, it's much more helpful to print a buffer from the offending inode instead of just the start of the inode chunk buffer. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:07 -07:00
Darrick J. Wong	90a58f9571	xfs: refactor inode verifier error logging Refactor some of the inode verifier failure logging call sites to use the new xfs_inode_verifier_error method which dumps the offending buffer as well as the code location of the failed check. This trims the output, makes it clearer to the admin that repair must be run, and gives the developers more details to work from. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:07 -07:00
Darrick J. Wong	30b0984d91	xfs: refactor bmap record validation Refactor the bmap validator into a more complete helper that looks for extents that run off the end of the device, overflow into the next AG, or have invalid flag states. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:07 -07:00
Darrick J. Wong	6915ef35c0	xfs: sanity-check the unused space before trying to use it In xfs_dir2_data_use_free, we examine on-disk metadata and ASSERT if it doesn't make sense. Since a carefully crafted fuzzed image can cause the kernel to crash after blowing a bunch of assertions, let's move those checks into a validator function and rig everything up to return EFSCORRUPTED to userspace. Found by lastbit fuzzing ltail.bestcount via xfs/391. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>	2018-03-23 18:05:07 -07:00
Brian Foster	a27ba2607e	xfs: detect agfl count corruption and reset agfl The struct xfs_agfl v5 header was originally introduced with unexpected padding that caused the AGFL to operate with one less slot than intended. The header has since been packed, but the fix left an incompatibility for users who upgrade from an old kernel with the unpacked header to a newer kernel with the packed header while the AGFL happens to wrap around the end. The newer kernel recognizes one extra slot at the physical end of the AGFL that the previous kernel did not. The new kernel will eventually attempt to allocate a block from that slot, which contains invalid data, and cause a crash. This condition can be detected by comparing the active range of the AGFL to the count. While this detects a padding mismatch, it can also trigger false positives for unrelated flcount corruption. Since we cannot distinguish a size mismatch due to padding from unrelated corruption, we can't trust the AGFL enough to simply repopulate the empty slot. Instead, avoid unnecessarily complex detection logic and use a solution that can handle any form of flcount corruption that slips through read verifiers: distrust the entire AGFL and reset it to an empty state. Any valid blocks within the AGFL are intentionally leaked. This requires xfs_repair to rectify (which was already necessary based on the state the AGFL was found in). The reset mitigates the side effect of the padding mismatch problem from a filesystem crash to a free space accounting inconsistency. The generic approach also means that this patch can be safely backported to kernels with or without a packed struct xfs_agfl. Check the AGF for an invalid freelist count on initial read from disk. If detected, set a flag on the xfs_perag to indicate that a reset is required before the AGFL can be used.
In the first transaction that attempts to use a flagged AGFL, reset it to empty, warn the user about the inconsistency and allow the freelist fixup code to repopulate the AGFL with new blocks. The xfs_perag flag is cleared to eliminate the need for repeated checks on each block allocation operation. This allows kernels that include the packing fix commit `96f859d52b` ("libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct") to handle older unpacked AGFL formats without a filesystem crash. Suggested-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chiluk <chiluk+linuxxfs@indeed.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-03-23 18:05:06 -07:00
Christoph Hellwig	3e4da466bf	xfs: unwind the try_again loop in xfs_log_force Instead split out a __xfs_log_force_lsn helper that gets called again with the already_slept flag set to true in case we had to sleep. This prepares for aio_fsync support. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-03-23 18:05:06 -07:00
Christoph Hellwig	93806299b5	xfs: refactor xfs_log_force_lsn Use the smallest possible loop as preamble to find the correct iclog buffer, and then use gotos for unwinding to straighten the code. Also fix the top of function comment while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-03-23 18:05:06 -07:00
Andreas Gruenbacher	bb491ce67a	gfs2: Check for the end of metadata in punch_hole When punching a hole or truncating an inode down to a given size, also check if the truncate point / start of the hole is within the range we have metadata for. Otherwise, we can end up freeing blocks that shouldn't be freed, corrupting the inode, or crashing the machine when trying to punch a hole into the void. When growing an inode via truncate, we set the new size but we don't allocate additional levels of indirect blocks and grow the inode height. When shrinking that inode again, the new size may still point beyond the end of the inode's metadata. Fixes xfstest generic/476. Debugged-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2018-03-23 11:43:02 -07:00
David S. Miller	03fe2debbb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Fun set of conflict resolutions here... For the mac80211 stuff, these were fortunately just parallel adds. Trivially resolved. In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the function phy_disable_interrupts() earlier in the file, whilst in 'net-next' the phy_error() call from this function was removed. In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the 'rt_table_id' member of rtable collided with a bug fix in 'net' that added a new struct member "rt_mtu_locked" which needs to be copied over here. The mlxsw driver conflict consisted of net-next separating the span code and definitions into separate files, whilst a 'net' bug fix made some changes to that moved code. The mlx5 infiniband conflict resolution was quite non-trivial, the RDMA tree's merge commit was used as a guide here, and here are their notes: ==================== Due to bug fixes found by the syzkaller bot and taken into the for-rc branch after development for the 4.17 merge window had already started being taken into the for-next branch, there were fairly non-trivial merge issues that would need to be resolved between the for-rc branch and the for-next branch. This merge resolves those conflicts and provides a unified base upon which ongoing development for 4.17 can be based. Conflicts: drivers/infiniband/hw/mlx5/main.c - Commit `42cea83f95` (IB/mlx5: Fix cleanup order on unload) added to for-rc and commit `b5ca15ad7e` (IB/mlx5: Add proper representors support) add as part of the devel cycle both needed to modify the init/de-init functions used by mlx5. To support the new representors, the new functions added by the cleanup patch needed to be made non-static, and the init/de-init list added by the representors patch needed to be modified to match the init/de-init list changes made by the cleanup patch. 
Updates: drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function prototypes added by representors patch to reflect new function names as changed by cleanup patch drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init stage list to match new order from cleanup patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-23 11:31:58 -04:00
Mimi Zohar	0834136aea	fuse: define the filesystem as untrusted Files on FUSE can change at any point in time without IMA being able to detect it. The file data read for the file signature verification could be totally different from what is subsequently read, making the signature verification useless. FUSE can be mounted by unprivileged users either today with fusermount installed with setuid, or soon with the upcoming patches to allow FUSE mounts in a non-init user namespace. This patch sets the SB_I_IMA_UNVERIFIABLE_SIGNATURE flag and when appropriate sets the SB_I_UNTRUSTED_MOUNTER flag. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Seth Forshee <seth.forshee@canonical.com> Cc: Dongsu Park <dongsu@kinvolk.io> Cc: Alban Crequy <alban@kinvolk.io> Acked-by: Serge Hallyn <serge@hallyn.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>	2018-03-23 06:31:37 -04:00
Linus Torvalds	f36b7534b8	Merge branch 'akpm' (patches from Andrew) Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, thp: do not cause memcg oom for thp mm/vmscan: wake up flushers for legacy cgroups too Revert "mm: page_alloc: skip over regions of invalid pfns where possible" mm/shmem: do not wait for lock_page() in shmem_unused_huge_shrink() mm/thp: do not wait for lock_page() in deferred_split_scan() mm/khugepaged.c: convert VM_BUG_ON() to collapse fail x86/mm: implement free pmd/pte page interfaces mm/vmalloc: add interfaces to free unmapped page table h8300: remove extraneous __BIG_ENDIAN definition hugetlbfs: check for pgoff value overflow lockdep: fix fs_reclaim warning MAINTAINERS: update Mark Fasheh's e-mail mm/mempolicy.c: avoid use uninitialized preferred_node	2018-03-22 18:48:43 -07:00
Mike Kravetz	63489f8e82	hugetlbfs: check for pgoff value overflow A vma with vm_pgoff large enough to overflow a loff_t type when converted to a byte offset can be passed via the remap_file_pages system call. The hugetlbfs mmap routine uses the byte offset to calculate reservations and file size. A sequence such as: mmap(0x20a00000, 0x600000, 0, 0x66033, -1, 0); remap_file_pages(0x20a00000, 0x600000, 0, 0x20000000000000, 0); will result in the following when task exits/file closed, kernel BUG at mm/hugetlb.c:749! Call Trace: hugetlbfs_evict_inode+0x2f/0x40 evict+0xcb/0x190 __dentry_kill+0xcb/0x150 __fput+0x164/0x1e0 task_work_run+0x84/0xa0 exit_to_usermode_loop+0x7d/0x80 do_syscall_64+0x18b/0x190 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 The overflowed pgoff value causes hugetlbfs to try to set up a mapping with a negative range (end < start) that leaves invalid state which causes the BUG. The previous overflow fix to this code was incomplete and did not take the remap_file_pages system call into account. [mike.kravetz@oracle.com: v3] Link: http://lkml.kernel.org/r/20180309002726.7248-1-mike.kravetz@oracle.com [akpm@linux-foundation.org: include mmdebug.h] [akpm@linux-foundation.org: fix -ve left shift count on sh] Link: http://lkml.kernel.org/r/20180308210502.15952-1-mike.kravetz@oracle.com Fixes: `045c7a3f53` ("hugetlbfs: fix offset overflow in hugetlbfs mmap") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: Nic Losby <blurbdust@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Yisheng Xie <xieyisheng1@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-03-22 17:07:01 -07:00
James Morris	5893ed18a2	Merge tag 'v4.16-rc6' into next-general Merge to Linux 4.16-rc6 at the request of Jarkko, for his TPM updates.	2018-03-23 08:26:16 +11:00
Linus Torvalds	c4f4d2f917	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Always validate XFRM esn replay attribute, from Florian Westphal. 2) Fix RCU read lock imbalance in xfrm_get_tos(), from Xin Long. 3) Don't try to get firmware dump if not loaded in iwlwifi, from Shaul Triebitz. 4) Fix BPF helpers to deal with SCTP GSO SKBs properly, from Daniel Axtens. 5) Fix some interrupt handling issues in e1000e driver, from Benjamin Poitier. 6) Use strlcpy() in several ethtool get_strings methods, from Florian Fainelli. 7) Fix rhlist dup insertion, from Paul Blakey. 8) Fix SKB leak in netem packet scheduler, from Alexey Kodanev. 9) Fix driver unload crash when link is up in smsc911x, from Jeremy Linton. 10) Purge out invalid socket types in l2tp_tunnel_create(), from Eric Dumazet. 11) Need to purge the write queue when TCP connections are aborted, otherwise userspace using MSG_ZEROCOPY can't close the fd. From Soheil Hassas Yeganeh. 12) Fix double free in error path of team driver, from Arkadi Sharshevsky. 13) Filter fixes for hv_netvsc driver, from Stephen Hemminger. 14) Fix non-linear packet access in ipv6 ndisc code, from Lorenzo Bianconi. 15) Properly filter out unsupported feature flags in macvlan driver, from Shannon Nelson. 16) Don't request loading the diag module for a protocol if the protocol itself is not even registered. From Xin Long. 17) If datagram connect fails in ipv6, make sure the socket state is consistent afterwards. From Paolo Abeni. 18) Use after free in qed driver, from Dan Carpenter. 19) If received ipv4 PMTU is less than the min pmtu, lock the mtu in the entry. From Sabrina Dubroca. 20) Fix sleep in atomic in tg3 driver, from Jonathan Toppins. 21) Fix vlan in vlan untagging in some situations, from Toshiaki Makita. 22) Fix double SKB free in genlmsg_mcast(). From Nicolas Dichtel. 23) Fix NULL derefs in error paths of tcf_*_init(), from Davide Caratti.
24) Unbalanced PM runtime calls in FEC driver, from Florian Fainelli. 25) Memory leak in gemini driver, from Igor Pylypiv. 26) IDR leaks in error paths of tcf_*_init() functions, from Davide Caratti. 27) Need to use GFP_ATOMIC in seg6_build_state(), from David Lebrun. 28) Missing dev_put() in error path of macsec_newlink(), from Dan Carpenter. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (201 commits) macsec: missing dev_put() on error in macsec_newlink() net: dsa: Fix functional dsa-loop dependency on FIXED_PHY hv_netvsc: common detach logic hv_netvsc: change GPAD teardown order on older versions hv_netvsc: use RCU to fix concurrent rx and queue changes hv_netvsc: disable NAPI before channel close net/ipv6: Handle onlink flag with multipath routes ppp: avoid loop in xmit recursion detection code ipv6: sr: fix NULL pointer dereference when setting encap source address ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state net: aquantia: driver version bump net: aquantia: Implement pci shutdown callback net: aquantia: Allow live mac address changes net: aquantia: Add tx clean budget and valid budget handling logic net: aquantia: Change inefficient wait loop on fw data reads net: aquantia: Fix a regression with reset on old firmware net: aquantia: Fix hardware reset when SPI may rarely hangup s390/qeth: on channel error, reject further cmd requests s390/qeth: lock read device while queueing next buffer s390/qeth: when thread completes, wake up all waiters ...	2018-03-22 14:10:29 -07:00
Eric Sandeen	0d9366d67b	ext4: don't complain about incorrect features when probing If mount is auto-probing for filesystem type, it will try various filesystems in order, with the MS_SILENT flag set. We get that flag as the silent arg to ext4_fill_super. If we're probing (silent==1) then don't complain about feature incompatibilities that are found if it looks like it's actually a different valid extN type - failed probes should be silent in this case. If the on-disk features are unknown even to ext4, then complain. Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Tested-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2018-03-22 11:59:00 -04:00
Nikolay Borisov	1d39834fba	ext4: remove EXT4_STATE_DIOREAD_LOCK flag Commit `16c5468859` ("ext4: Allow parallel DIO reads") reworked the way locking happens around parallel dio reads. This resulted in obviating the need for EXT4_STATE_DIOREAD_LOCK flag and accompanying logic. Currently this amounts to dead code so let's remove it. No functional changes. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>	2018-03-22 11:52:10 -04:00
Jiri Slaby	fe23cb65c2	ext4: fix offset overflow on 32-bit archs in ext4_iomap_begin() ext4_iomap_begin() has a bug where offset returned in the iomap structure will be truncated to unsigned long size. On 64-bit architectures this is fine but on 32-bit architectures obviously not. Not many places actually use the offset stored in the iomap structure but one of visible failures is in SEEK_HOLE / SEEK_DATA implementation. If we create a file like: dd if=/dev/urandom of=file bs=1k seek=8m count=1 then lseek64("file", 0x100000000ULL, SEEK_DATA) wrongly returns 0x100000000 on unfixed kernel while it should return 0x200000000. Avoid the overflow by proper type cast. Fixes: `545052e9e3` ("ext4: Switch to iomap for SEEK_HOLE / SEEK_DATA") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org # v4.15	2018-03-22 11:50:26 -04:00
Eryu Guan	45d8ec4d9f	ext4: update i_disksize if direct write past ondisk size Currently in ext4 direct write path, we update i_disksize only when new eof is greater than i_size, and don't update it even when new eof is greater than i_disksize but less than i_size. This doesn't work well with delalloc buffer write, which updates i_size and i_disksize only when delalloc blocks are resolved (at writeback time), the i_disksize from direct write can be lost if a previous buffer write succeeded at write time but failed at writeback time, then results in corrupted ondisk inode size. Consider this case, first buffer write 4k data to a new file at offset 16k with delayed allocation, then direct write 4k data to the same file at offset 4k before delalloc blocks are resolved, which doesn't update i_disksize because it writes within i_size(20k), but the extent tree metadata has been committed in journal. Then writeback of the delalloc blocks fails (due to device error etc.), and i_size/i_disksize from buffer write can't be written to disk (still zero). A subsequent umount/mount cycle recovers journal and writes extent tree metadata from direct write to disk, but with i_disksize being zero. Fix it by updating i_disksize too in direct write path when new eof is greater than i_disksize but less than i_size, so i_disksize is always consistent with direct write. This fixes occasional i_size corruption in fstests generic/475. Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2018-03-22 11:44:59 -04:00
Eryu Guan	73fdad00b2	ext4: protect i_disksize update by i_data_sem in direct write path i_disksize update should be protected by i_data_sem, by either taking the lock explicitly or by using ext4_update_i_disksize() helper. But the i_disksize updates in ext4_direct_IO_write() are not protected at all, which may be racing with i_disksize updates in writeback path in delalloc buffer write path. This is found by code inspection, and I didn't hit any i_disksize corruption due to this bug. Thanks to Jan Kara for catching this bug and suggesting the fix! Reported-by: Jan Kara <jack@suse.cz> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2018-03-22 11:41:25 -04:00
Richard Guy Briggs	ea841bafda	audit: add refused symlink to audit_names Audit link denied events for symlinks had duplicate PATH records rather than just updating the existing PATH record. Update the symlink's PATH record with the current dentry and inode information. See: https://github.com/linux-audit/audit-kernel/issues/21 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2018-03-21 11:31:03 -04:00
Richard Guy Briggs	94b9d9b7a1	audit: remove path param from link denied function In commit `45b578fe4c` ("audit: link denied should not directly generate PATH record") the need for the struct path *link parameter was removed. Remove the now useless struct path argument. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>	2018-03-21 11:17:41 -04:00
Linus Torvalds	645102eac1	Merge tag 'nfsd-4.16-1' of git://linux-nfs.org/~bfields/linux Pull nfsd fix from Bruce Fields: "Just one fix for an occasional panic from Jeff Layton" * tag 'nfsd-4.16-1' of git://linux-nfs.org/~bfields/linux: nfsd: remove blocked locks on client teardown	2018-03-20 16:10:26 -07:00
J. Bruce Fields	353601e7d3	nfsd: create a separate lease for each delegation Currently we only take one vfs-level delegation (lease) for each file, no matter how many clients hold delegations on that file. Let's instead keep a one-to-one mapping between NFSv4 delegations and VFS delegations. This turns out to be simpler. There is still a many-to-one mapping of NFS opens to NFS files, and the delegations on one file are all associated with one struct file. The VFS can still distinguish between these delegations since we're setting fl_owner to the struct nfs4_delegation now, not to the shared file. I'm replacing at least one complicated function wholesale, which I don't like to do, but I haven't figured out how to do this more incrementally. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:14 -04:00
J. Bruce Fields	86d29b10eb	nfsd: move sc_file assignment into alloc_init_deleg Take an easy chance to simplify the caller a little. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:13 -04:00
J. Bruce Fields	0af6e690f0	nfsd: factor out common delegation-destruction code Pull some duplicated code into a common helper. This changes the order in destroy_delegation a little, but it looks to me like that shouldn't matter. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:13 -04:00
J. Bruce Fields	68b18f5294	nfsd: make nfs4_get_existing_delegation less confusing This doesn't "get" anything. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:12 -04:00
J. Bruce Fields	0c911f5408	nfsd4: dp->dl_stid.sc_file doesn't need locking The delegation isn't visible to anyone yet. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:12 -04:00
J. Bruce Fields	653e514e9e	nfsd4: set fl_owner to delegation, not file pointer For now this makes no difference, as for files having delegations, there's a one-to-one relationship between an nfs4_file and its nfs4_delegation. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:11 -04:00
J. Bruce Fields	cba7b3d150	nfsd: simplify nfs4_put_deleg_lease calls Every single caller gets the file out of the delegation, so let's do that once in nfs4_put_deleg_lease. Plus we'll need it there for other reasons. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:11 -04:00
J. Bruce Fields	b8232d3315	nfsd: simplify put of fi_deleg_file fi_delegees is basically just a reference count on users of fi_deleg_file, which is cleared when fi_delegees goes to zero. The fi_deleg_file check here is redundant. Also add an assertion to make sure we don't have unbalanced puts. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>	2018-03-20 17:51:10 -04:00
Miklos Szeredi	bf5c1898bf	fuse: honor AT_STATX_FORCE_SYNC Force a refresh of attributes from the fuse server in this case. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Miklos Szeredi	ff1b89f389	fuse: honor AT_STATX_DONT_SYNC The description of this flag says "Don't sync attributes with the server". In other words: always use the attributes cached in the kernel and don't send network or local messages to refresh the attributes. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Seth Forshee	73f03c2b4b	fuse: Restrict allow_other to the superblock's namespace or a descendant Unprivileged users are normally restricted from mounting with the allow_other option by system policy, but this could be bypassed for a mount done with user namespace root permissions. In such cases allow_other should not allow users outside the userns to access the mount as doing so would give the unprivileged user the ability to manipulate processes it would otherwise be unable to manipulate. Restrict allow_other to apply to users in the same userns used at mount or a descendant of that namespace. Also export current_in_userns() for use by fuse when built as a module. Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Dongsu Park <dongsu@kinvolk.io> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Eric W. Biederman	8cb08329b0	fuse: Support fuse filesystems outside of init_user_ns In order to support mounts from namespaces other than init_user_ns, fuse must translate uids and gids to/from the userns of the process servicing requests on /dev/fuse. This patch does that, with a couple of restrictions on the namespace: - The userns for the fuse connection is fixed to the namespace from which /dev/fuse is opened. - The namespace must be the same as s_user_ns. These restrictions simplify the implementation by avoiding the need to pass around userns references and by allowing fuse to rely on the checks in setattr_prepare for ownership changes. Either restriction could be relaxed in the future if needed. For cuse the userns used is the opener of /dev/cuse. Semantically the cuse support does not appear safe for unprivileged users. Practically the permissions on /dev/cuse only make it accessible to the global root user. If something slips through the cracks in a user namespace the only users who will be able to use the cuse device are those users mapped into the user namespace. Translation in the posix acl is updated to use the user namespace of the filesystem. Avoiding cases which might bypass this translation is handled in a following change. This change is strongly based on a similar change from Seth Forshee and Dongsu Park. Cc: Seth Forshee <seth.forshee@canonical.com> Cc: Dongsu Park <dongsu@kinvolk.io> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Eric W. Biederman	c9582eb0ff	fuse: Fail all requests with invalid uids or gids Upon a cursory examination the uid and gid of a fuse request are necessary for correct operation. Failing a fuse request where those values are not reliable seems a straightforward and reliable means of ensuring that fuse requests with bad data are not sent or processed. In most cases the vfs will avoid actions it suspects will cause an inode write back of an inode with an invalid uid or gid. But that does not map precisely to what fuse is doing, so test for this and solve this at the fuse level as well. Performing this work in fuse_req_init_context is cheap as the code is already performing the translation here and only needs to check the result of the translation to see if things are not representable in a form the fuse server can handle. [SzM] Don't zero the context for the nofail case, just keep using the munging version (makes sense for debugging and doesn't hurt). Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Eric W. Biederman	dbf107b2a7	fuse: Remove the buggy retranslation of pids in fuse_dev_do_read At the point of fuse_dev_do_read the user space process that initiated the action on the fuse filesystem may no longer exist. The process may have been killed or may have fired an asynchronous request and exited. If the initial process has exited, the code "pid_vnr(find_pid_ns(in->h.pid, fc->pid_ns))" will either return a pid of 0, or in the unlikely event that the pid has been reallocated it can return practically any pid. Any pid is possible as the pid allocator allocates pid numbers in different pid namespaces independently. The only way to make translation in fuse_dev_do_read reliable is to call get_pid in fuse_req_init_context, and pid_vnr followed by put_pid in fuse_dev_do_read. That reference counting in other contexts has been shown to bounce cache lines between processors and in general be slow. So that is not desirable. The only known user of running the fuse server in a different pid namespace from the filesystem does not care what the pids are in the fuse messages so removing this code should not matter. Getting the translation to a server running outside of the pid namespace of a container can still be achieved by playing setns games at mount time. It is also possible to add an option to pass a pid namespace into the fuse filesystem at mount time. Fixes: `5d6d3a301c` ("fuse: allow server to run in different pid_ns") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Szymon Lukasz	3b7008b226	fuse: return -ECONNABORTED on /dev/fuse read after abort Currently the userspace has no way of knowing whether the fuse connection ended because of umount or abort via sysfs. It makes it hard for filesystems to free the mountpoint after abort without worrying about removing some new mount. The patch fixes it by returning different errors when userspace reads from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort). Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is gone because of sysfs abort, reading from the device will return -ECONNABORTED. Signed-off-by: Szymon Lukasz <noh4hss@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>	2018-03-20 17:11:44 +01:00
Miklos Szeredi	df0e91d488	fuse: atomic_o_trunc should truncate pagecache Fuse has an "atomic_o_trunc" mode, where userspace filesystem uses the O_TRUNC flag in the OPEN request to truncate the file atomically with the open. In this mode there's no need to send a SETATTR request to userspace after the open, so fuse_do_setattr() checks this mode and returns. But this misses the important step of truncating the pagecache. Add the missing parts of truncation to the ATTR_OPEN branch. Reported-by: Chad Austin <chadaustin@fb.com> Fixes: `6ff958edbf` ("fuse: add atomic open+truncate support") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>	2018-03-20 17:11:43 +01:00
Greg Kroah-Hartman	4958134df5	Merge 4.16-rc6 into tty-next We want the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2018-03-20 11:27:18 +01:00
Peter Zijlstra	e24e960c7f	sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Joel Becker <jlbec@evilplan.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:22 +01:00
Peter Zijlstra	723c921e7d	sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Anna Schumaker <anna.schumaker@netapp.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:21 +01:00
Peter Zijlstra	dc5d4afbb0	sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:21 +01:00
Peter Zijlstra	4625956a4e	sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David Sterba <dsterba@suse.com> Cc: Chris Mason <clm@fb.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:20 +01:00
Peter Zijlstra	ab1fbe3247	sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API The old wait_on_atomic_t() is going to get removed, use the more flexible wait_var_event() API instead. No change in functionality. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:23:19 +01:00
Grygorii Strashko	2399ac42e7	sysfs: symlink: export sysfs_create_link_nowarn() The sysfs_create_link_nowarn() is going to be used in phylib framework in subsequent patch which can be built as module. Hence, export sysfs_create_link_nowarn() to avoid build errors. Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Fixes: `a399546049` ("net: phy: Relax error checking on sysfs_create_link()") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-19 21:14:26 -04:00
Jeff Layton	9258a2d5cd	nfsd: move nfs4_client allocation to dedicated slabcache On x86_64, it's 1152 bytes, so we can avoid wasting 896 bytes each. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-03-19 16:38:13 -04:00
J. Bruce Fields	9d7ed1355d	nfsd: don't require low ports for gss requests In a traditional NFS deployment using auth_unix, the clients are trusted to correctly report the credentials of their logged-in users. The server assumes that only root on client machines is allowed to send requests from low-numbered ports, so it can use the originating port number to distinguish "real" NFS clients from NFS clients run by ordinary users, to prevent ordinary users from spoofing credentials. The originating port number on a gss-authenticated request is less important. The authentication ties the request to a user, and we take it as proof that that user authorized the request. The low port number check no longer adds much. So, don't enforce low port numbers in the auth_gss case. Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-03-19 16:38:13 -04:00
J. Bruce Fields	edcc8452a0	nfsd: remove unused "cp_consecutive" field Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2018-03-19 16:38:13 -04:00

56429 Commits