android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Darrick J. Wong	ea0cc6fa8f	xfs: refactor quota exceeded test Refactor the open-coded test for whether or not we're over quota. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	c8c753e19a	xfs: remove unnecessary arguments from quota adjust functions struct xfs_dquot already has a pointer to the xfs mount, so remove the redundant parameter from xfs_qm_adjust_dq*. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	438769e31e	xfs: refactor default quota limits by resource Now that we've split up the dquot resource fields into separate structs, do the same for the default limits to enable further refactoring. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	51dbb1be52	xfs: remove qcore from incore dquots Now that we've stopped using qcore entirely, drop it from the incore dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	19dce7eaef	xfs: stop using q_core timers in the quota code Add timers fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	c8c45fb2f6	xfs: stop using q_core warning counters in the quota code Add warning counter fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	be37d40c1b	xfs: stop using q_core counters in the quota code Add counter fields to the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	d3537cf93e	xfs: stop using q_core limits in the quota code Add limits fields in the incore dquot, and use that instead of the ones in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	784e80f564	xfs: use a per-resource struct for incore dquot data Introduce a new struct xfs_dquot_res that we'll use to track all the incore data for a particular resource type (block, inode, rt block). This will help us (once we've eliminated q_core) to declutter quota functions that currently open-code field access or pass around fields around explicitly. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	c51df73341	xfs: stop using q_core.d_id in the quota code Add a dquot id field to the incore dquot, and use that instead of the one in qcore. This eliminates a bunch of endian conversions and will eventually allow us to remove qcore entirely. We also rearrange the start of xfs_dquot to remove padding holes, saving 8 bytes. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Allison Collins <allison.henderson@oracle.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	0b0fa1d1d1	xfs: stop using q_core.d_flags in the quota code Use the incore dq_flags to figure out the dquot type. This is the first step towards removing xfs_disk_dquot from the incore dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	cb64e12993	xfs: make XFS_DQUOT_CLUSTER_SIZE_FSB part of the ondisk format Move the dquot cluster size #define to xfs_format.h. It is an important part of the ondisk format because the ondisk dquot record size is not an even power of two, which means that the buffer size we use is significant here because the kernel leaves slack space at the end of the buffer to avoid having to deal with a dquot record crossing a block boundary. This is also an excuse to fix one of the longstanding discrepancies between kernel and userspace libxfs headers. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	985a78fdde	xfs: rename dquot incore state flags Rename the existing incore dquot "dq_flags" field to "q_flags" to match everything else in the structure, then move the two actual dquot state flags to the XFS_DQFLAG_ namespace from XFS_DQ_. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	0dcc0728c1	xfs: refactor quotacheck flags usage We only use the XFS_QMOPT flags in quotacheck to signal the quota type, so rip out all the flags handling and just pass the type all the way through. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	41ed4a5f2b	xfs: move the flags argument of xfs_qm_scall_trunc_qfiles to XFS_QMOPT_* Since xfs_qm_scall_trunc_qfiles can take a bitset of quota types that we want to truncate, change the flags argument to take XFS_QMOPT_[UGP}QUOTA so that the next patch can start to deprecate XFS_DQ_*. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	afeda6000b	xfs: validate ondisk/incore dquot flags While loading dquot records off disk, make sure that the quota type flags are the same between the incore dquot and the ondisk dquot. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	f959b5d037	xfs: fix inode quota reservation checks xfs_trans_dqresv is the function that we use to make reservations against resource quotas. Each resource contains two counters: the q_core counter, which tracks resources allocated on disk; and the dquot reservation counter, which tracks how much of that resource has either been allocated or reserved by threads that are working on metadata updates. For disk blocks, we compare the proposed reservation counter against the hard and soft limits to decide if we're going to fail the operation. However, for inodes we inexplicably compare against the q_core counter, not the incore reservation count. Since the q_core counter is always lower than the reservation count and we unlock the dquot between reservation and transaction commit, this means that multiple threads can reserve the last inode count before we hit the hard limit, and when they commit, we'll be well over the hard limit. Fix this by checking against the incore inode reservation counter, since we would appear to maintain that correctly (and that's what we report in GETQUOTA). Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Darrick J. Wong	c97738a960	xfs: clear XFS_DQ_FREEING if we can't lock the dquot buffer to flush In commit `8d3d7e2b35`, we changed xfs_qm_dqpurge to bail out if we can't lock the dquot buf to flush the dquot. This prevents the AIL from blocking on the dquot, but it also forgets to clear the FREEING flag on its way out. A subsequent purge attempt will see the FREEING flag is set and bail out, which leads to dqpurge_all failing to purge all the dquots. (copy-pasting from Dave Chinner's identical patch) This was found by inspection after having xfs/305 hang 1 in ~50 iterations in a quotaoff operation: [ 8872.301115] xfs_quota D13888 92262 91813 0x00004002 [ 8872.302538] Call Trace: [ 8872.303193] __schedule+0x2d2/0x780 [ 8872.304108] ? do_raw_spin_unlock+0x57/0xd0 [ 8872.305198] schedule+0x6e/0xe0 [ 8872.306021] schedule_timeout+0x14d/0x300 [ 8872.307060] ? __next_timer_interrupt+0xe0/0xe0 [ 8872.308231] ? xfs_qm_dqusage_adjust+0x200/0x200 [ 8872.309422] schedule_timeout_uninterruptible+0x2a/0x30 [ 8872.310759] xfs_qm_dquot_walk.isra.0+0x15a/0x1b0 [ 8872.311971] xfs_qm_dqpurge_all+0x7f/0x90 [ 8872.313022] xfs_qm_scall_quotaoff+0x18d/0x2b0 [ 8872.314163] xfs_quota_disable+0x3a/0x60 [ 8872.315179] kernel_quotactl+0x7e2/0x8d0 [ 8872.316196] ? __do_sys_newstat+0x51/0x80 [ 8872.317238] __x64_sys_quotactl+0x1e/0x30 [ 8872.318266] do_syscall_64+0x46/0x90 [ 8872.319193] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8872.320490] RIP: 0033:0x7f46b5490f2a [ 8872.321414] Code: Bad RIP value. Returning -EAGAIN from xfs_qm_dqpurge() without clearing the XFS_DQ_FREEING flag means the xfs_qm_dqpurge_all() code can never free the dquot, and we loop forever waiting for the XFS_DQ_FREEING flag to go away on the dquot that leaked it via -EAGAIN. Fixes: `8d3d7e2b35` ("xfs: trylock underlying buffer on dquot flush") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2020-07-28 20:24:14 -07:00
Brian Foster	b2a8864728	xfs: fix inode allocation block res calculation precedence The block reservation calculation for inode allocation is supposed to consist of the blocks required for the inode chunk plus (maxlevels-1) of the inode btree multiplied by the number of inode btrees in the fs (2 when finobt is enabled, 1 otherwise). Instead, the macro returns (ialloc_blocks + 2) due to a precedence error in the calculation logic. This leads to block reservation overruns via generic/531 on small block filesystems with finobt enabled. Add braces to fix the calculation and reserve the appropriate number of blocks. Fixes: `9d43b180af` ("xfs: update inode allocation/free transaction reservations for finobt") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-07-28 20:24:14 -07:00
Brian Foster	f376b45e86	xfs: drain the buf delwri queue before xfsaild idles xfsaild is racy with respect to transaction abort and shutdown in that the task can idle or exit with an empty AIL but buffers still on the delwri queue. This was partly addressed by cancelling the delwri queue before the task exits to prevent memory leaks, but it's also possible for xfsaild to empty and idle with buffers on the delwri queue. For example, a transaction that pins a buffer that also happens to sit on the AIL delwri queue will explicitly remove the associated log item from the AIL if the transaction aborts. The side effect of this is an unmount hang in xfs_wait_buftarg() as the associated buffers remain held by the delwri queue indefinitely. This is reproduced on repeated runs of generic/531 with an fs format (-mrmapbt=1 -bsize=1k) that happens to also reproduce transaction aborts. Update xfsaild to not idle until both the AIL and associated delwri queue are empty and update the push code to continue delwri queue submission attempts even when the AIL is empty. This allows the AIL to eventually release aborted buffers stranded on the delwri queue when they are unlocked by the associated transaction. This should have no significant effect on normal runtime behavior because the xfsaild currently idles only when the AIL is empty and in practice the AIL is rarely empty with a populated delwri queue. The items must be AIL resident to land in the queue in the first place and generally aren't removed until writeback completes. Note that the pre-existing delwri queue cancel logic in the exit path is retained because task stop is external, could technically come at any point, and xfsaild is still responsible to release its buffer references before it exits. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2020-07-28 20:24:14 -07:00
Ira Weiny	c7fe193f18	fs/dax: Remove unused size parameter Passing size to copy_user_dax implies it can copy variable sizes of data when in fact it calls copy_user_page() which is exactly a page. We are safe because the only caller uses PAGE_SIZE anyway so just remove the variable for clarity. While we are at it change copy_user_dax() to copy_cow_page_dax() to make it clear it is a singleton helper for this one case not implementing what dax_iomap_actor() does. Link: https://lore.kernel.org/r/20200717072056.73134-11-ira.weiny@intel.com Reviewed-by: Ben Widawsky <ben.widawsky@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>	2020-07-28 11:49:29 -06:00
Mike Marshall	476af91933	orangefs: posix acl fix... Al Viro pointed out that I broke some acl functionality... * ACLs could not be fully removed * posix_acl_chmod would be called while the old ACL was still cached * new mode propagated to orangefs server before ACL. ... when I tried to make sure that modes that got changed as a result of ACL-sets would be sent back to the orangefs server. Not wanting to try and change the code without having some cases to test it with, I began to hunt for setfacl examples that were expressible in pure mode. Along the way I found examples like the following which confused me: user A had a file (/home/A/asdf) with mode 740 user B was in user A's group user C was not in user A's group setfacl -m u:C:rwx /home/A/asdf The above setfacl caused ls -l /home/A/asdf to show a mode of 770, making it appear that all users in user A's group now had full access to /home/A/asdf, however, user B still only had read acces. Madness. Anywho, I finally found that the above (whacky as it is) appears to be "posixly on purpose" and explained in acl(5): If the ACL has an ACL_MASK entry, the group permissions correspond to the permissions of the ACL_MASK entry. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2020-07-28 12:52:53 -04:00
Colin Ian King	9a74a2b87f	NFS: remove redundant initialization of variable result The variable result is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>	2020-07-28 11:04:06 -04:00
Jan Kara	8aed8cebdd	fanotify: compare fsid when merging name event When merging name events, fsids of the two involved events have to match. Otherwise we could merge events from two different filesystems and thus effectively loose the second event. Backporting note: Although the commit `cacfb956d4` introducing this bug was merged for 5.7, the relevant code didn't get used in the end until `7e8283af6e` ("fanotify: report parent fid + name + child fid") which will be merged with this patch. So there's no need for backporting this. Fixes: `cacfb956d4` ("fanotify: record name info for FAN_DIR_MODIFY event") Reported-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-28 10:58:07 +02:00
Amir Goldstein	b9a1b97725	fsnotify: create method handle_inode_event() in fsnotify_operations The method handle_event() grew a lot of complexity due to the design of fanotify and merging of ignore masks. Most backends do not care about this complex functionality, so we can hide this complexity from them. Introduce a method handle_inode_event() that serves those backends and passes a single inode mark and less arguments. This change converts all backends except fanotify and inotify to use the simplified handle_inode_event() method. In pricipal, inotify could have also used the new method, but that would require passing more arguments on the simple helper (data, data_type, cookie), so we leave it with the handle_event() method. Link: https://lore.kernel.org/r/20200722125849.17418-9-amir73il@gmail.com Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:25:50 +02:00
Amir Goldstein	691d976352	fanotify: report parent fid + child fid Add support for FAN_REPORT_FID \| FAN_REPORT_DIR_FID. Internally, it is implemented as a private case of reporting both parent and child fids and name, the parent and child fids are recorded in a variable length fanotify_name_event, but there is no name. It should be noted that directory modification events are recorded in fixed size fanotify_fid_event when not reporting name, just like with group flags FAN_REPORT_FID. Link: https://lore.kernel.org/r/20200716084230.30611-23-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:24:01 +02:00
Amir Goldstein	7e8283af6e	fanotify: report parent fid + name + child fid For a group with fanotify_init() flag FAN_REPORT_DFID_NAME, the parent fid and name are reported for events on non-directory objects with an info record of type FAN_EVENT_INFO_TYPE_DFID_NAME. If the group also has the init flag FAN_REPORT_FID, the child fid is also reported with another info record that follows the first info record. The second info record is the same info record that would have been reported to a group with only FAN_REPORT_FID flag. When the child fid needs to be recorded, the variable size struct fanotify_name_event is preallocated with enough space to store the child fh between the dir fh and the name. Link: https://lore.kernel.org/r/20200716084230.30611-22-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:24:00 +02:00
Amir Goldstein	929943b38d	fanotify: add support for FAN_REPORT_NAME Introduce a new fanotify_init() flag FAN_REPORT_NAME. It requires the flag FAN_REPORT_DIR_FID and there is a constant for setting both flags named FAN_REPORT_DFID_NAME. For a group with flag FAN_REPORT_NAME, the parent fid and name are reported for directory entry modification events (create/detete/move) and for events on non-directory objects. Events on directories themselves are reported with their own fid and "." as the name. The parent fid and name are reported with an info record of type FAN_EVENT_INFO_TYPE_DFID_NAME, similar to the way that parent fid is reported with into type FAN_EVENT_INFO_TYPE_DFID, but with an appended null terminated name string. Link: https://lore.kernel.org/r/20200716084230.30611-21-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:24:00 +02:00
Amir Goldstein	5128063739	fanotify: report events with parent dir fid to sb/mount/non-dir marks In a group with flag FAN_REPORT_DIR_FID, when adding an inode mark with FAN_EVENT_ON_CHILD, events on non-directory children are reported with the fid of the parent. When adding a filesystem or mount mark or mark on a non-dir inode, we want to report events that are "possible on child" (e.g. open/close) also with fid of the parent, as if the victim inode's parent is interested in events "on child". Some events, currently only FAN_MOVE_SELF, should be reported to a sb/mount/non-dir mark with parent fid even though they are not reported to a watching parent. To get the desired behavior we set the flag FAN_EVENT_ON_CHILD on all the sb/mount/non-dir mark masks in a group with FAN_REPORT_DIR_FID. Link: https://lore.kernel.org/r/20200716084230.30611-20-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:24:00 +02:00
Amir Goldstein	83b7a59896	fanotify: add basic support for FAN_REPORT_DIR_FID For now, the flag is mutually exclusive with FAN_REPORT_FID. Events include a single info record of type FAN_EVENT_INFO_TYPE_DFID with a directory file handle. For now, events are only reported for: - Directory modification events - Events on children of a watching directory - Events on directory objects Soon, we will add support for reporting the parent directory fid for events on non-directories with filesystem/mount mark and support for reporting both parent directory fid and child fid. Link: https://lore.kernel.org/r/20200716084230.30611-19-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:24:00 +02:00
Amir Goldstein	9b93f33105	fsnotify: send event with parent/name info to sb/mount/non-dir marks Similar to events "on child" to watching directory, send event with parent/name info if sb/mount/non-dir marks are interested in parent/name info. The FS_EVENT_ON_CHILD flag can be set on sb/mount/non-dir marks to specify interest in parent/name info for events on non-directory inodes. Events on "orphan" children (disconnected dentries) are sent without parent/name info. Events on directories are sent with parent/name info only if the parent directory is watching. After this change, even groups that do not subscribe to events on children could get an event with mark iterator type TYPE_CHILD and without mark iterator type TYPE_INODE if fanotify has marks on the same objects. dnotify and inotify event handlers can already cope with that situation. audit does not subscribe to events that are possible on child, so won't get to this situation. nfsd does not access the marks iterator from its event handler at the moment, so it is not affected. This is a bit too fragile, so we should prepare all groups to cope with mark type TYPE_CHILD preferably using a generic helper. Link: https://lore.kernel.org/r/20200716084230.30611-16-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:21:02 +02:00
Amir Goldstein	957f7b472c	inotify: do not set FS_EVENT_ON_CHILD in non-dir mark mask FS_EVENT_ON_CHILD has currently no meaning for non-dir inode marks. In the following patches we want to use that bit to mean that mark's notification group cares about parent and name information. So stop setting FS_EVENT_ON_CHILD for non-dir marks. Link: https://lore.kernel.org/r/20200722125849.17418-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:16:16 +02:00
Amir Goldstein	40a100d3ad	fsnotify: pass dir and inode arguments to fsnotify() The arguments of fsnotify() are overloaded and mean different things for different event types. Replace the to_tell argument with separate arguments @dir and @inode, because we may be sending to both dir and child. Using the @data argument to pass the child is not enough, because dirent events pass this argument (for audit), but we do not report to child. Document the new fsnotify() function argumenets. Link: https://lore.kernel.org/r/20200722125849.17418-7-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:15:48 +02:00
Amir Goldstein	82ace1efb3	fsnotify: create helper fsnotify_inode() Simple helper to consolidate biolerplate code. Link: https://lore.kernel.org/r/20200722125849.17418-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 23:13:51 +02:00
Amir Goldstein	497b0c5a7c	fsnotify: send event to parent and child with single callback Instead of calling fsnotify() twice, once with parent inode and once with child inode, if event should be sent to parent inode, send it with both parent and child inodes marks in object type iterator and call the backend handle_event() callback only once. The parent inode is assigned to the standard "inode" iterator type and the child inode is assigned to the special "child" iterator type. In that case, the bit FS_EVENT_ON_CHILD will be set in the event mask, the dir argument to handle_event will be the parent inode, the file_name argument to handle_event is non NULL and refers to the name of the child and the child inode can be accessed with fsnotify_data_inode(). This will allow fanotify to make decisions based on child or parent's ignored mask. For example, when a parent is interested in a specific event on its children, but a specific child wishes to ignore this event, the event will not be reported. This is not what happens with current code, but according to man page, it is the expected behavior. Link: https://lore.kernel.org/r/20200716084230.30611-15-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:24:52 +02:00
Amir Goldstein	c8f3446c66	inotify: report both events on parent and child with single callback fsnotify usually calls inotify_handle_event() once for watching parent to report event with child's name and once for watching child to report event without child's name. Do the same thing with a single callback instead of two callbacks when marks iterator contains both inode and child entries. Link: https://lore.kernel.org/r/20200716084230.30611-13-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:24:51 +02:00
Amir Goldstein	62cb0af4ce	dnotify: report both events on parent and child with single callback For some events (e.g. DN_ATTRIB on sub-directory) fsnotify may call dnotify_handle_event() once for watching parent and once again for the watching sub-directory. Do the same thing with a single callback instead of two callbacks when marks iterator contains both inode and child entries. Link: https://lore.kernel.org/r/20200716084230.30611-12-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:24:51 +02:00
Amir Goldstein	f35c415678	fanotify: no external fh buffer in fanotify_name_event The fanotify_fh struct has an inline buffer of size 12 which is enough to store the most common local filesystem file handles (e.g. ext4, xfs). For file handles that do not fit in the inline buffer (e.g. btrfs), an external buffer is allocated to store the file handle. When allocating a variable size fanotify_name_event, there is no point in allocating also an external fh buffer when file handle does not fit in the inline buffer. Check required size for encoding fh, preallocate an event buffer sufficient to contain both file handle and name and store the name after the file handle. At this time, when not reporting name in event, we still allocate the fixed size fanotify_fid_event and an external buffer for large file handles, but fanotify_alloc_name_event() has already been prepared to accept a NULL file_name. Link: https://lore.kernel.org/r/20200716084230.30611-11-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:37 +02:00
Amir Goldstein	f454fa610a	fanotify: use struct fanotify_info to parcel the variable size buffer An fanotify event name is always recorded relative to a dir fh. Encapsulate the name_len member of fanotify_name_event in a new struct fanotify_info, which describes the parceling of the variable size buffer of an fanotify_name_event. The dir_fh member of fanotify_name_event is renamed to _dir_fh and is not accessed directly, but via the fanotify_info_dir_fh() accessor. Although the dir_fh len information is already available in struct fanotify_fh, we store it also in dif_fh_totlen member of fanotify_info, including the size of fanotify_fh header, so we know the offset of the name in the buffer without looking inside the dir_fh. We also add a file_fh_totlen member to allow packing another file handle in the variable size buffer after the dir_fh and before the name. We are going to use that space to store the child fid. Link: https://lore.kernel.org/r/20200716084230.30611-10-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:37 +02:00
Amir Goldstein	85af5d9258	fanotify: use FAN_EVENT_ON_CHILD as implicit flag on sb/mount/non-dir marks Up to now, fanotify allowed to set the FAN_EVENT_ON_CHILD flag on sb/mount marks and non-directory inode mask, but the flag was ignored. Mask out the flag if it is provided by user on sb/mount/non-dir marks and define it as an implicit flag that cannot be removed by user. This flag is going to be used internally to request for events with parent and name info. Link: https://lore.kernel.org/r/20200716084230.30611-8-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:37 +02:00
Amir Goldstein	4ed6814a91	fanotify: prepare for implicit event flags in mark mask So far, all flags that can be set in an fanotify mark mask can be set explicitly by a call to fanotify_mark(2). Prepare for defining implicit event flags that cannot be set by user with fanotify_mark(2), similar to how inotify/dnotify implicitly set the FS_EVENT_ON_CHILD flag. Implicit event flags cannot be removed by user and mark gets destroyed when only implicit event flags remain in the mask. Link: https://lore.kernel.org/r/20200716084230.30611-7-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	3ef8665366	fanotify: mask out special event flags from ignored mask The special event flags (FAN_ONDIR, FAN_EVENT_ON_CHILD) never had any meaning in ignored mask. Mask them out explicitly. Link: https://lore.kernel.org/r/20200716084230.30611-6-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	d809daf1b6	fanotify: generalize test for FAN_REPORT_FID As preparation for new flags that report fids, define a bit set of flags for a group reporting fids, currently containing the only bit FAN_REPORT_FID. Link: https://lore.kernel.org/r/20200716084230.30611-5-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	6ad1aadd97	fanotify: distinguish between fid encode error and null fid In fanotify_encode_fh(), both cases of NULL inode and failure to encode ended up with fh type FILEID_INVALID. Distiguish the case of NULL inode, by setting fh type to FILEID_ROOT. This is just a semantic difference at this point. Remove stale comment and unneeded check from fid event compare helpers. Link: https://lore.kernel.org/r/20200716084230.30611-4-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	103ff6a554	fanotify: generalize merge logic of events on dir An event on directory should never be merged with an event on non-directory regardless of the event struct type. This change has no visible effect, because currently, with struct fanotify_path_event, the relevant events will not be merged because event path of dir will be different than event path of non-dir. Link: https://lore.kernel.org/r/20200716084230.30611-3-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	0badfa029e	fanotify: generalize the handling of extra event flags In fanotify_group_event_mask() there is logic in place to make sure we are not going to handle an event with no type and just FAN_ONDIR flag. Generalize this logic to any FANOTIFY_EVENT_FLAGS. There is only one more flag in this group at the moment - FAN_EVENT_ON_CHILD. We never report it to user, but we do pass it in to fanotify_alloc_event() when group is reporting fid as indication that event happened on child. We will have use for this indication later on. Link: https://lore.kernel.org/r/20200716084230.30611-2-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Amir Goldstein	08b95c338e	fanotify: remove event FAN_DIR_MODIFY It was never enabled in uapi and its functionality is about to be superseded by events FAN_CREATE, FAN_DELETE, FAN_MOVE with group flag FAN_REPORT_NAME. Keep a place holder variable name_event instead of removing the name recording code since it will be used by the new events. Link: https://lore.kernel.org/r/20200708111156.24659-17-amir73il@gmail.com Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2020-07-27 21:23:36 +02:00
Al Viro	1697a322e2	[elf-fdpic] switch coredump to regsets similar to how elf coredump is working on architectures that have regsets, and all architectures with elf-fdpic support do have that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-07-27 14:29:24 -04:00
Al Viro	d2f581684a	[elf-fdpic] use elf_dump_thread_status() for the dumper thread as well the only reason to have it open-coded for the first (dumper) thread is that coredump has a couple of process-wide notes stuck right after the first (NT_PRSTATUS) note of the first thread. But we don't need to make the data collection side irregular for the first thread to handle that - it's only the logics ordering the calls of writenote() that needs to take care of that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-07-27 14:29:23 -04:00
Al Viro	38a62779ae	[elf-fdpic] move allocation of elf_thread_status into elf_dump_thread_status() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-07-27 14:29:23 -04:00

... 7 8 9 10 11 ...

66232 Commits