android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Theodore Ts'o	97abd7d4b5	ext4: preserve the needs_recovery flag when the journal is aborted If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2017-02-04 23:38:06 -05:00
Theodore Ts'o	e112666b49	jbd2: don't leak modified metadata buffers on an aborted journal If the journal has been aborted, we shouldn't mark the underlying buffer head as dirty, since that will cause the metadata block to get modified. And if the journal has been aborted, we shouldn't allow this since it will almost certainly lead to a corrupted file system. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2017-02-04 23:14:19 -05:00
Theodore Ts'o	eb5efbcb76	ext4: fix inline data error paths The write_end() function must always unlock the page and drop its ref count, even on an error. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org	2017-02-04 23:04:00 -05:00
Hou Tao	4dd2eb6335	xfs: reset b_first_retry_time when clear the retry status of xfs_buf_t After successful IO or permanent error, b_first_retry_time also needs to be cleared, else the invalid first retry time will be used by the next retry check. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-02-03 14:39:07 -08:00
Michal Hocko	d1908f5255	fs: break out of iomap_file_buffered_write on fatal signals Tetsuo has noticed that an OOM stress test which performs large write requests can cause the full memory reserves depletion. He has tracked this down to the following path __alloc_pages_nodemask+0x436/0x4d0 alloc_pages_current+0x97/0x1b0 __page_cache_alloc+0x15d/0x1a0 mm/filemap.c:728 pagecache_get_page+0x5a/0x2b0 mm/filemap.c:1331 grab_cache_page_write_begin+0x23/0x40 mm/filemap.c:2773 iomap_write_begin+0x50/0xd0 fs/iomap.c:118 iomap_write_actor+0xb5/0x1a0 fs/iomap.c:190 ? iomap_write_end+0x80/0x80 fs/iomap.c:150 iomap_apply+0xb3/0x130 fs/iomap.c:79 iomap_file_buffered_write+0x68/0xa0 fs/iomap.c:243 ? iomap_write_end+0x80/0x80 xfs_file_buffered_aio_write+0x132/0x390 [xfs] ? remove_wait_queue+0x59/0x60 xfs_file_write_iter+0x90/0x130 [xfs] __vfs_write+0xe5/0x140 vfs_write+0xc7/0x1f0 ? syscall_trace_enter+0x1d0/0x380 SyS_write+0x58/0xc0 do_syscall_64+0x6c/0x200 entry_SYSCALL64_slow_path+0x25/0x25 the oom victim has access to all memory reserves to make a forward progress to exit easier. But iomap_file_buffered_write and other callers of iomap_apply loop to complete the full request. We need to check for fatal signals and back off with a short write instead. As the iomap_apply delegates all the work down to the actor we have to hook into those. All callers that work with the page cache are calling iomap_write_begin so we will check for signals there. dax_iomap_actor has to handle the situation explicitly because it copies data to the userspace directly. Other callers like iomap_page_mkwrite work on a single page or iomap_fiemap_actor do not allocate memory based on the given len. Fixes: `68a9f5e700` ("xfs: implement iomap based buffered write path") Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-02-03 14:13:19 -08:00
Jan Kara	70823b9bf3	orangefs: Remove orangefs_backing_dev_info It is not used anywhere. CC: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-02-03 14:38:35 -05:00
Martin Brandenburg	31c829f364	orangefs: Support readahead_readcnt parameter. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-02-03 14:38:28 -05:00
Dan Carpenter	eb82fbcf82	orangefs: silence harmless integer overflow warning The issue here is that in orangefs_bufmap_alloc() we do: bufmap->buffer_index_array = kzalloc(DIV_ROUND_UP(bufmap->desc_count, BITS_PER_LONG), GFP_KERNEL); If we choose a bufmap->desc_count like -31 then it means the DIV_ROUND_UP ends up having an integer overflow. The result is that kzalloc() returns the ZERO_SIZE_PTR and there is a static checker warning. But this bug is harmless because on the next lines we use ->desc_count to do a kcalloc(). That has integer overflow checking built in so the kcalloc() fails and we return an error code. Anyway, it doesn't make sense to talk about negative sizes and blocking them silences the static checker warning. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2017-02-03 14:37:15 -05:00
Fabian Frederick	a074faad51	udf: simplify udf_ioctl() "out" label was only returning error code. Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>	2017-02-03 16:24:18 +01:00
Fabian Frederick	782deb2eec	udf: fix ioctl errors Currently, lsattr for instance in udf directory gives "udf: Invalid argument While reading flags on ..." This patch returns -ENOIOCTLCMD when command is unknown to have more accurate message like this: "Inappropriate ioctl for device While reading flags on ..." Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Jan Kara <jack@suse.cz>	2017-02-03 16:19:54 +01:00
Andrew Price	c548a1c175	gfs2: Make gfs2_write_full_page static It only gets called from aops.c and doesn't appear in any headers. Signed-off-by: Andrew Price <anprice@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com>	2017-02-03 08:23:47 -05:00
Eric W. Biederman	1064f874ab	mnt: Tuck mounts under others instead of creating shadow/side mounts. Ever since mount propagation was introduced in cases where a mount in propagated to parent mount mountpoint pair that is already in use the code has placed the new mount behind the old mount in the mount hash table. This implementation detail is problematic as it allows creating arbitrary length mount hash chains. Furthermore it invalidates the constraint maintained elsewhere in the mount code that a parent mount and a mountpoint pair will have exactly one mount upon them. Making it hard to deal with and to talk about this special case in the mount code. Modify mount propagation to notice when there is already a mount at the parent mount and mountpoint where a new mount is propagating to and place that preexisting mount on top of the new mount. Modify unmount propagation to notice when a mount that is being unmounted has another mount on top of it (and no other children), and to replace the unmounted mount with the mount on top of it. Move the MNT_UMUONT test from __lookup_mnt_last into __propagate_umount as that is the only call of __lookup_mnt_last where MNT_UMOUNT may be set on any mount visible in the mount hash table. These modifications allow: - __lookup_mnt_last to be removed. - attach_shadows to be renamed __attach_mnt and its shadow handling to be removed. - commit_tree to be simplified - copy_tree to be simplified The result is an easier to understand tree of mounts that does not allow creation of arbitrary length hash chains in the mount hash table. The result is also a very slight userspace visible difference in semantics. The following two cases now behave identically, where before order mattered: case 1: (explicit user action) B is a slave of A mount something on A/a , it will propagate to B/a and than mount something on B/a case 2: (tucked mount) B is a slave of A mount something on B/a and than mount something on A/a Histroically umount A/a would fail in case 1 and succeed in case 2. Now umount A/a succeeds in both configurations. This very small change in semantics appears if anything to be a bug fix to me and my survey of userspace leads me to believe that no programs will notice or care of this subtle semantic change. v2: Updated to mnt_change_mountpoint to not call dput or mntput and instead to decrement the counts directly. It is guaranteed that there will be other references when mnt_change_mountpoint is called so this is safe. v3: Moved put_mountpoint under mount_lock in attach_recursive_mnt As the locking in fs/namespace.c changed between v2 and v3. v4: Reworked the logic in propagate_mount_busy and __propagate_umount that detects when a mount completely covers another mount. v5: Removed unnecessary tests whose result is alwasy true in find_topper and attach_recursive_mnt. v6: Document the user space visible semantic difference. Cc: stable@vger.kernel.org Fixes: `b90fa9ae8f` ("[PATCH] shared mount handling: bind and rbind") Tested-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-02-04 00:01:06 +13:00
Eric W. Biederman	015bb305b8	Merge branch 'nsfs-discovery' Michael Kerrisk <<mtk.manpages@gmail.com> writes: I would like to write code that discovers the namespace setup on a live system. The NS_GET_PARENT and NS_GET_USERNS ioctl() operations added in Linux 4.9 provide much of what I want, but there are still a couple of small pieces missing. Those pieces are added with this patch series. Here's an example program that makes use of the new ioctl() operations. 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- /* ns_capable.c (C) 2016 Michael Kerrisk, <mtk.manpages@gmail.com> Licensed under the GNU General Public License v2 or later. Test whether a process (identified by PID) might (subject to LSM checks) have capabilities in a namespace (identified by a /proc/PID/ns/xxx file). / } while (0) exit(EXIT_FAILURE); } while (0) / Display capabilities sets of process with specified PID / static void show_cap(pid_t pid) { cap_t caps; char cap_string; caps = cap_get_pid(pid); if (caps == NULL) errExit("cap_get_proc"); cap_string = cap_to_text(caps, NULL); if (cap_string == NULL) errExit("cap_to_text"); printf("Capabilities: %s\n", cap_string); } /* Obtain the effective UID pf the process 'pid' by scanning its /proc/PID/file / static uid_t get_euid_of_process(pid_t pid) { char path[PATH_MAX]; char line[1024]; int uid; snprintf(path, sizeof(path), "/proc/%ld/status", (long) pid); FILE fp; fp = fopen(path, "r"); if (fp == NULL) errExit("fopen-/proc/PID/status"); for (;;) { if (fgets(line, sizeof(line), fp) == NULL) { /* Should never happen... / fprintf(stderr, "Failure scanning %s\n", path); exit(EXIT_FAILURE); } if (strstr(line, "Uid:") == line) { sscanf(line, "Uid: %d %d %d %d", &uid); return uid; } } } int main(int argc, char argv[]) { int ns_fd, userns_fd, pid_userns_fd; int nstype; int next_fd; struct stat pid_stat; struct stat target_stat; char pid_str; pid_t pid; char path[PATH_MAX]; if (argc < 2) { fprintf(stderr, "Usage: %s PID [ns-file]\n", argv[0]); fprintf(stderr, "\t'ns-file' is a /proc/PID/ns/xxxx file; " "if omitted, use the namespace\n" "\treferred to by standard input " "(file descriptor 0)\n"); exit(EXIT_FAILURE); } pid_str = argv[1]; pid = atoi(pid_str); if (argc <= 2) { ns_fd = STDIN_FILENO; } else { ns_fd = open(argv[2], O_RDONLY); if (ns_fd == -1) errExit("open-ns-file"); } /* Get the relevant user namespace FD, which is 'ns_fd' if 'ns_fd' refers to a user namespace, otherwise the user namespace that owns 'ns_fd' / nstype = ioctl(ns_fd, NS_GET_NSTYPE); if (nstype == -1) errExit("ioctl-NS_GET_NSTYPE"); if (nstype == CLONE_NEWUSER) { userns_fd = ns_fd; } else { userns_fd = ioctl(ns_fd, NS_GET_USERNS); if (userns_fd == -1) errExit("ioctl-NS_GET_USERNS"); } / Obtain 'stat' info for the user namespace of the specified PID / snprintf(path, sizeof(path), "/proc/%s/ns/user", pid_str); pid_userns_fd = open(path, O_RDONLY); if (pid_userns_fd == -1) errExit("open-PID"); if (fstat(pid_userns_fd, &pid_stat) == -1) errExit("fstat-PID"); / Get 'stat' info for the target user namesapce / if (fstat(userns_fd, &target_stat) == -1) errExit("fstat-PID"); / If the PID is in the target user namespace, then it has whatever capabilities are in its sets. / if (pid_stat.st_dev == target_stat.st_dev && pid_stat.st_ino == target_stat.st_ino) { printf("PID is in target namespace\n"); printf("Subject to LSM checks, it has the following capabilities\n"); show_cap(pid); exit(EXIT_SUCCESS); } / Otherwise, we need to walk through the ancestors of the target user namespace to see if PID is in an ancestor namespace / for (;;) { int f; next_fd = ioctl(userns_fd, NS_GET_PARENT); if (next_fd == -1) { / The error here should be EPERM... / if (errno != EPERM) errExit("ioctl-NS_GET_PARENT"); printf("PID is not in an ancestor namespace\n"); printf("It has no capabilities in the target namespace\n"); exit(EXIT_SUCCESS); } if (fstat(next_fd, &target_stat) == -1) errExit("fstat-PID"); / If the 'stat' info for this user namespace matches the 'stat' * info for 'next_fd', then the PID is in an ancestor namespace / if (pid_stat.st_dev == target_stat.st_dev && pid_stat.st_ino == target_stat.st_ino) break; / Next time round, get the next parent / f = userns_fd; userns_fd = next_fd; close(f); } / At this point, we found that PID is in an ancestor of the target user namespace, and 'userns_fd' refers to the immediate descendant user namespace of PID in the chain of user namespaces from PID to the target user namespace. If the effective UID of PID matches the owner UID of descendant user namespace, then PID has all capabilities in the descendant namespace(s); otherwise, it just has the capabilities that are in its sets. */ uid_t owner_uid, uid; if (ioctl(userns_fd, NS_GET_OWNER_UID, &owner_uid) == -1) { perror("ioctl-NS_GET_OWNER_UID"); exit(EXIT_FAILURE); } uid = get_euid_of_process(pid); printf("PID is in an ancestor namespace\n"); if (owner_uid == uid) { printf("And its effective UID matches the owner " "of the namespace\n"); printf("Subject to LSM checks, PID has all capabilities in " "that namespace!\n"); } else { printf("But its effective UID does not match the owner " "of the namespace\n"); printf("Subject to LSM checks, it has the following capabilities\n"); show_cap(pid); } exit(EXIT_SUCCESS); } 8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x---8x--- Michael Kerrisk (2): nsfs: Add an ioctl() to return the namespace type nsfs: Add an ioctl() to return owner UID of a userns fs/nsfs.c \| 13 +++++++++++++ include/uapi/linux/nsfs.h \| 9 +++++++-- 2 files changed, 20 insertions(+), 2 deletions(-)	2017-02-03 14:53:27 +13:00
Michael Kerrisk (man-pages)	d95fa3c76a	nsfs: Add an ioctl() to return owner UID of a userns I'd like to write code that discovers the user namespace hierarchy on a running system, and also shows who owns the various user namespaces. Currently, there is no way of getting the owner UID of a user namespace. Therefore, this patch adds a new NS_GET_CREATOR_UID ioctl() that fetches the UID (as seen in the user namespace of the caller) of the creator of the user namespace referred to by the specified file descriptor. If the supplied file descriptor does not refer to a user namespace, the operation fails with the error EINVAL. If the owner UID does not have a mapping in the caller's user namespace return the overflow UID as that appears easier to deal with in practice in user-space applications. -- EWB Changed the handling of unmapped UIDs from -EOVERFLOW back to the overflow uid. Per conversation with Michael Kerrisk after examining his test code. Acked-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: Michael Kerrisk <mtk-manpages@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2017-02-03 14:35:43 +13:00
Darrick J. Wong	5eda430000	xfs: mark speculative prealloc CoW fork extents unwritten Christoph Hellwig pointed out that there's a potentially nasty race when performing simultaneous nearby directio cow writes: "Thread 1 writes a range from B to c " B --------- C p "a little later thread 2 writes from A to B " A --------- B p [editor's note: the 'p' denote cowextsize boundaries, which I added to make this more clear] "but the code preallocates beyond B into the range where thread "1 has just written, but ->end_io hasn't been called yet. "But once ->end_io is called thread 2 has already allocated "up to the extent size hint into the write range of thread 1, "so the end_io handler will splice the unintialized blocks from "that preallocation back into the file right after B." We can avoid this race by ensuring that thread 1 cannot accidentally remap the blocks that thread 2 allocated (as part of speculative preallocation) as part of t2's write preparation in t1's end_io handler. The way we make this happen is by taking advantage of the unwritten extent flag as an intermediate step. Recall that when we begin the process of writing data to shared blocks, we create a delayed allocation extent in the CoW fork: D: --RRRRRRSSSRRRRRRRR--- C: ------DDDDDDD--------- When a thread prepares to CoW some dirty data out to disk, it will now convert the delalloc reservation into an /unwritten/ allocated extent in the cow fork. The da conversion code tries to opportunistically allocate as much of a (speculatively prealloc'd) extent as possible, so we may end up allocating a larger extent than we're actually writing out: D: --RRRRRRSSSRRRRRRRR--- U: ------UUUUUUU--------- Next, we convert only the part of the extent that we're actively planning to write to normal (i.e. not unwritten) status: D: --RRRRRRSSSRRRRRRRR--- U: ------UURRUUU--------- If the write succeeds, the end_cow function will now scan the relevant range of the CoW fork for real extents and remap only the real extents into the data fork: D: --RRRRRRRRSRRRRRRRR--- U: ------UU--UUU--------- This ensures that we never obliterate valid data fork extents with unwritten blocks from the CoW fork. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:14:02 -08:00
Darrick J. Wong	05a630d76b	xfs: allow unwritten extents in the CoW fork In the data fork, we only allow extents to perform the following state transitions: delay -> real <-> unwritten There's no way to move directly from a delalloc reservation to an /unwritten/ allocated extent. However, for the CoW fork we want to be able to do the following to each extent: delalloc -> unwritten -> written -> remapped to data fork This will help us to avoid a race in the speculative CoW preallocation code between a first thread that is allocating a CoW extent and a second thread that is remapping part of a file after a write. In order to do this, however, we need two things: first, we have to be able to transition from da to unwritten, and second the function that converts between real and unwritten has to be made aware of the cow fork. Do both of those things. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:14:01 -08:00
Darrick J. Wong	de14c5f541	xfs: verify free block header fields Perform basic sanity checking of the directory free block header fields so that we avoid hanging the system on invalid data. (Granted that just means that now we shutdown on directory write, but that seems better than hanging...) Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:14:00 -08:00
Darrick J. Wong	b3bf607d58	xfs: check for obviously bad level values in the bmbt root We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's no such thing as a zero-level bmbt (for that we have extents format), so if we see this, send back an error code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:13:59 -08:00
Darrick J. Wong	d5a91baeb6	xfs: filter out obviously bad btree pointers Don't let anybody load an obviously bad btree pointer. Since the values come from disk, we must return an error, not just ASSERT. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>	2017-02-02 15:13:58 -08:00
Darrick J. Wong	7a652bbe36	xfs: fail _dir_open when readahead fails When we open a directory, we try to readahead block 0 of the directory on the assumption that we're going to need it soon. If the bmbt is corrupt, the directory will never be usable and the readahead fails immediately, so we might as well prevent the directory from being opened at all. This prevents a subsequent read or modify operation from hitting it and taking the fs offline. NOTE: We're only checking for early failures in the block mapping, not the readahead directory block itself. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:13:58 -08:00
Darrick J. Wong	4b5bd5bf3f	xfs: fix toctou race when locking an inode to access the data map We use di_format and if_flags to decide whether we're grabbing the ilock in btree mode (btree extents not loaded) or shared mode (anything else), but the state of those fields can be changed by other threads that are also trying to load the btree extents -- IFEXTENTS gets set before the _bmap_read_extents call and cleared if it fails. We don't actually need to have IFEXTENTS set until after the bmbt records are successfully loaded and validated, which will fix the race between multiple threads trying to read the same directory. The next patch strengthens directory bmbt validation by refusing to open the directory if reading the bmbt to start directory readahead fails. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-02-02 15:13:57 -08:00
David S. Miller	e2160156bf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net All merge conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-02-02 16:54:00 -05:00
Linus Torvalds	1fc576b82b	Merge tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux Pull nfsd fixes from Bruce Fields: "Three more miscellaneous nfsd bugfixes" * tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux: svcrpc: fix oops in absence of krb5 module nfsd: special case truncates some more NFSD: Fix a null reference case in find_or_create_lock_stateid()	2017-02-02 12:49:58 -08:00
Omar Sandoval	a7c5437b0b	debugfs: add debugfs_lookup() We don't always have easy access to the dentry of a file or directory we created in debugfs. Add a helper which allows us to get a dentry we previously created. The motivation for this change is a problem with blktrace and the blk-mq debugfs entries introduced in `07e4fead45` ("blk-mq: create debugfs directory tree"). Namely, in some cases, the directory that blktrace needs to create may already exist, but in other cases, it may not. We _could_ rely on a bunch of implied knowledge to decide whether to create the directory or not, but it's much cleaner on our end to just look it up. Signed-off-by: Omar Sandoval <osandov@fb.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-02 10:20:16 -07:00
Jason A. Donenfeld	1c83a9aab8	ext4: move halfmd4 into hash.c directly The "half md4" transform should not be used by any new code. And fortunately, it's only used now by ext4. Since ext4 supports several hashing methods, at some point it might be desirable to move to something like SipHash. As an intermediate step, remove half md4 from cryptohash.h and lib, and make it just a local function in ext4's hash.c. There's precedent for doing this; the other function ext can use for its hashes -- TEA -- is also implemented in the same place. Also, by being a local function, this might allow gcc to perform some additional optimizations. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-02-02 11:52:14 -05:00
Jan Kara	efa7c9f97e	block: Get rid of blk_get_backing_dev_info() blk_get_backing_dev_info() is now a simple dereference. Remove that function and simplify some code around that. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-02 08:21:32 -07:00
Jan Kara	b1d2dc5659	block: Make blk_get_backing_dev_info() safe without open bdev Currenly blk_get_backing_dev_info() is not safe to be called when the block device is not open as bdev->bd_disk is NULL in that case. However inode_to_bdi() uses this function and may be call called from flusher worker or other writeback related functions without bdev being open which leads to crashes such as: [113031.075540] Unable to handle kernel paging request for data at address 0x00000000 [113031.075614] Faulting instruction address: 0xc0000000003692e0 0:mon> t [c0000000fb65f900] c00000000036cb6c writeback_sb_inodes+0x30c/0x590 [c0000000fb65fa10] c00000000036ced4 __writeback_inodes_wb+0xe4/0x150 [c0000000fb65fa70] c00000000036d33c wb_writeback+0x30c/0x450 [c0000000fb65fb40] c00000000036e198 wb_workfn+0x268/0x580 [c0000000fb65fc50] c0000000000f3470 process_one_work+0x1e0/0x590 [c0000000fb65fce0] c0000000000f38c8 worker_thread+0xa8/0x660 [c0000000fb65fd80] c0000000000fc4b0 kthread+0x110/0x130 [c0000000fb65fe30] c0000000000098f0 ret_from_kernel_thread+0x5c/0x6c Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-02 08:20:53 -07:00
Jan Kara	dc3b17cc8b	block: Use pointer to backing_dev_info from request_queue We will want to have struct backing_dev_info allocated separately from struct request_queue. As the first step add pointer to backing_dev_info to request_queue and convert all users touching it. No functional changes in this patch. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-02 08:20:48 -07:00
Jan Kara	f44f1ab5a2	block: Unhash block device inodes on gendisk destruction Currently, block device inodes stay around after corresponding gendisk hash died until memory reclaim finds them and frees them. Since we will make block device inode pin the bdi, we want to free the block device inode as soon as the device goes away so that bdi does not stay around unnecessarily. Furthermore we need to avoid issues when new device with the same major,minor pair gets created since reusing the bdi structure would be rather difficult in this case. Unhashing block device inode on gendisk destruction nicely deals with these problems. Once last block device inode reference is dropped (which may be directly in del_gendisk()), the inode gets evicted. Furthermore if the major,minor pair gets reallocated, we are guaranteed to get new block device inode even if old block device inode is not yet evicted and thus we avoid issues with possible reuse of bdi. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-02 08:18:41 -07:00
Eric Biggers	dd01b690f8	ext4: fix use-after-iput when fscrypt contexts are inconsistent In the case where the child's encryption context was inconsistent with its parent directory, we were using inode->i_sb and inode->i_ino after the inode had already been iput(). Fix this by doing the iput() in the correct places. Note: only ext4 had this bug, not f2fs and ubifs. Fixes: `d9cdc90331` ("ext4 crypto: enforce context consistency") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-02-01 21:07:11 -05:00
Sahitya Tummala	dbfcef6b0f	jbd2: fix use after free in kjournald2() Below is the synchronization issue between unmount and kjournald2 contexts, which results into use after free issue in kjournald2(). Fix this issue by using journal->j_state_lock to synchronize the wait_event() done in journal_kill_thread() and the wake_up() done in kjournald2(). TASK 1: umount cmd: \|--jbd2_journal_destroy() { \|--journal_kill_thread() { write_lock(&journal->j_state_lock); journal->j_flags \|= JBD2_UNMOUNT; ... write_unlock(&journal->j_state_lock); wake_up(&journal->j_wait_commit); TASK 2 wakes up here: kjournald2() { ... checks JBD2_UNMOUNT flag and calls goto end-loop; ... end_loop: write_unlock(&journal->j_state_lock); journal->j_task = NULL; --> If this thread gets pre-empted here, then TASK 1 wait_event will exit even before this thread is completely done. wait_event(journal->j_wait_done_commit, journal->j_task == NULL); ... write_lock(&journal->j_state_lock); write_unlock(&journal->j_state_lock); } \|--kfree(journal); } } wake_up(&journal->j_wait_done_commit); --> this step now results into use after free issue. } Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2017-02-01 20:49:35 -05:00
Pavel Shilovsky	ae6f8dd4d0	CIFS: Allow to switch on encryption with seal mount option This allows users to inforce encryption for SMB3 shares if a server supports it. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:37 -06:00
Pavel Shilovsky	c42a6abe30	CIFS: Add capability to decrypt big read responses Allow to decrypt transformed packets that are bigger than the big buffer size. In particular it is used for read responses that can only exceed the big buffer size. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:37 -06:00
Pavel Shilovsky	4326ed2f6a	CIFS: Decrypt and process small encrypted packets Allow to decrypt transformed packets, find a corresponding mid and process as usual further. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	d70b9104b1	CIFS: Add copy into pages callback for a read operation Since we have two different types of reads (pagecache and direct) we need to process such responses differently after decryption of a packet. The change allows to specify a callback that copies a read payload data into preallocated pages. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	9b7c18a2d4	CIFS: Add mid handle callback We need to process read responses differently because the data should go directly into preallocated pages. This can be done by specifying a mid handle callback. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	9bb17e0916	CIFS: Add transform header handling callbacks We need to recognize and parse transformed packets in demultiplex thread to find a corresponsing mid and process it further. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	026e93dc0a	CIFS: Encrypt SMB3 requests before sending This change allows to encrypt packets if it is required by a server for SMB sessions or tree connections. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	cabfb3680f	CIFS: Enable encryption during session setup phase In order to allow encryption on SMB connection we need to exchange a session key and generate encryption and decryption keys. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:36 -06:00
Pavel Shilovsky	7fb8986e74	CIFS: Add capability to transform requests before sending This will allow us to do protocol specific tranformations of packets before sending to the server. For SMB3 it can be used to support encryption. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:35 -06:00
Pavel Shilovsky	b8f57ee8aa	CIFS: Separate RFC1001 length processing for SMB2 read Allocate and initialize SMB2 read request without RFC1001 length field to directly call cifs_send_recv() rather than SendReceive2() in a read codepath. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:35 -06:00
Pavel Shilovsky	cb200bd626	CIFS: Separate SMB2 sync header processing Do not process RFC1001 length in smb2_hdr_assemble() because it is not a part of SMB2 header. This allows to cleanup the code and adds a possibility combine several SMB2 packets into one for compounding. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:35 -06:00
Pavel Shilovsky	738f9de5cd	CIFS: Send RFC1001 length in a separate iov In order to simplify further encryption support we need to separate RFC1001 length and SMB2 header when sending a request. Put the length field in iov[0] and the rest of the packet into following iovs. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:35 -06:00
Pavel Shilovsky	fb2036d817	CIFS: Make send_cancel take rqst as argument Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:35 -06:00
Pavel Shilovsky	da502f7df0	CIFS: Make SendReceive2() takes resp iov Now SendReceive2 frees the first iov and returns a response buffer in it that increases a code complexity. Simplify this by making a caller responsible for freeing request buffer itself and returning a response buffer in a separate iov. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:34 -06:00
Pavel Shilovsky	31473fc4f9	CIFS: Separate SMB2 header structure In order to support compounding and encryption we need to separate RFC1001 length field and SMB2 header structure because the protocol treats them differently. This change will allow to simplify parsing of such complex SMB2 packets further. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:34 -06:00
Pavel Shilovsky	9c25702cee	CIFS: Fix splice read for non-cached files Currently we call copy_page_to_iter() for uncached reading into a pipe. This is wrong because it treats pages as VFS cache pages and copies references rather than actual data. When we are trying to read from the pipe we end up calling page_cache_pipe_buf_confirm() which returns -ENODATA. This error is translated into 0 which is returned to a user. This issue is reproduced by running xfs-tests suite (generic test #249) against mount points with "cache=none". Fix it by mapping pages manually and calling copy_to_iter() that copies data into the pipe. Cc: Stable <stable@vger.kernel.org> Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>	2017-02-01 16:46:34 -06:00
Jean Delvare	b9be76d585	cifs: Add soft dependencies List soft dependencies of cifs so that mkinitrd and dracut can include the required helper modules. Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Steve French <sfrench@samba.org>	2017-02-01 16:46:34 -06:00
Jean Delvare	3692304bba	cifs: Only select the required crypto modules The sha256 and cmac crypto modules are only needed for SMB2+, so move the select statements to config CIFS_SMB2. Also select CRYPTO_AES there as SMB2+ needs it. Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Steve French <sfrench@samba.org>	2017-02-01 16:46:34 -06:00
Jean Delvare	c1ecea8747	cifs: Simplify SMB2 and SMB311 dependencies * CIFS_SMB2 depends on CIFS, which depends on INET and selects NLS. So these dependencies do not need to be repeated for CIFS_SMB2. * CIFS_SMB311 depends on CIFS_SMB2, which depends on INET. So this dependency doesn't need to be repeated for CIFS_SMB311. Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Steve French <sfrench@samba.org>	2017-02-01 16:46:33 -06:00

... 13 14 15 16 17 ...

48450 Commits