android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Mimi Zohar	39eeb4fb97	security: define kernel_read_file hook The kernel_read_file security hook is called prior to reading the file into memory. Changelog v4+: - export security_kernel_read_file() Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com>	2016-02-21 09:06:09 -05:00
Mimi Zohar	09596b94f7	vfs: define kernel_read_file_from_path This patch defines kernel_read_file_from_path(), a wrapper for the VFS common kernel_read_file(). Changelog: - revert error msg regression - reported by Sergey Senozhatsky - Separated from the IMA patch Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk>	2016-02-21 08:55:00 -05:00
Linus Torvalds	0389075ecf	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "This is unusually large, partly due to the EFI fixes that prevent accidental deletion of EFI variables through efivarfs that may brick machines. These fixes are somewhat involved to maintain compatibility with existing install methods and other usage modes, while trying to turn off the 'rm -rf' bricking vector. Other fixes are for large page ioremap()s and for non-temporal user-memcpy()s" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix vmalloc_fault() to handle large pages properly hpet: Drop stale URLs x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache() x86/uaccess/64: Make the __copy_user_nocache() assembly code more readable lib/ucs2_string: Correct ucs2 -> utf8 conversion efi: Add pstore variables to the deletion whitelist efi: Make efivarfs entries immutable by default efi: Make our variable validation list include the guid efi: Do variable name validation tests in utf8 efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version lib/ucs2_string: Add ucs2 -> utf8 helper functions	2016-02-20 09:32:40 -08:00
Maxim Patlasov	7ae8fd0351	fs/pnode.c: treat zero mnt_group_id-s as unequal propagate_one(m) calculates "type" argument for copy_tree() like this: > if (m->mnt_group_id == last_dest->mnt_group_id) { > type = CL_MAKE_SHARED; > } else { > type = CL_SLAVE; > if (IS_MNT_SHARED(m)) > type \|= CL_MAKE_SHARED; > } The "type" argument then governs clone_mnt() behavior with respect to flags and mnt_master of new mount. When we iterate through a slave group, it is possible that both current "m" and "last_dest" are not shared (although, both are slaves, i.e. have non-NULL mnt_master-s). Then the comparison above erroneously makes new mount shared and sets its mnt_master to last_source->mnt_master. The patch fixes the problem by handling zero mnt_group_id-s as though they are unequal. The similar problem exists in the implementation of "else" clause above when we have to ascend upward in the master/slave tree by calling: > last_source = last_source->mnt_master; > last_dest = last_source->mnt_parent; proper number of times. The last step is governed by "n->mnt_group_id != last_dest->mnt_group_id" condition that may lie if both are zero. The patch fixes this case in the same way as the former one. [AV: don't open-code an obvious helper...] Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-02-20 00:15:52 -05:00
Al Viro	0bacbe528e	affs_do_readpage_ofs(): just use kmap_atomic() around memcpy() It forgets kunmap() on a failure exit, but there's really no point keeping the page kmapped at all - after all, what we are doing is a bunch of memcpy() into the parts of page, so kmap_atomic()/kunmap_atomic() just around those memcpy() is enough. Spotted-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-02-20 00:15:51 -05:00
Mateusz Guzik	0e9a7da51b	xattr handlers: plug a lock leak in simple_xattr_list The code could leak xattrs->lock on error. Problem introduced with `786534b92f` "tmpfs: listxattr should include POSIX ACL xattrs". Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-02-20 00:15:51 -05:00
Wouter van Kesteren	2feb55f890	fs: allow no_seek_end_llseek to actually seek The user-visible impact of the issue is for example that without this patch sensors-detect breaks when trying to seek in /dev/cpu/0/cpuid. '~0ULL' is a 'unsigned long long' that when converted to a loff_t, which is signed, gets turned into -1. later in vfs_setpos we have 'if (offset > maxsize)', which makes it always return EINVAL. Fixes: `b25472f9b9` ("new helpers: no_seek_end_llseek{,_size}()") Signed-off-by: Wouter van Kesteren <woutershep@gmail.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-02-20 00:15:50 -05:00
Linus Torvalds	020ecbba05	Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bugfixes from Ted Ts'o: "Miscellaneous ext4 bug fixes for v4.5" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix crashes in dioread_nolock mode ext4: fix bh->b_state corruption ext4: fix memleak in ext4_readdir() ext4: remove unused parameter "newblock" in convert_initialized_extent() ext4: don't read blocks from disk after extents being swapped ext4: fix potential integer overflow ext4: add a line break for proc mb_groups display ext4: ioctl: fix erroneous return value ext4: fix scheduling in atomic on group checksum failure ext4 crypto: move context consistency check to ext4_file_open() ext4 crypto: revalidate dentry after adding or removing the key	2016-02-19 13:44:12 -08:00
Linus Torvalds	ce6b71432d	Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fix from Chris Mason: "My for-linus-4.5 branch has a btrfs DIO error passing fix. I know how much you love DIO, so I'm going to suggest against reading it. We'll follow up with a patch to drop the error arg from dio_end_io in the next merge window." * 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: fix direct IO requests not reporting IO error to user space	2016-02-19 13:40:42 -08:00
Linus Torvalds	87d9ac712b	Merge branch 'akpm' (patches from Andrew) Merge fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: slab: free kmem_cache_node after destroy sysfs file ipc/shm: handle removed segments gracefully in shm_mmap() MAINTAINERS: update Kselftest Framework mailing list devm_memremap_release(): fix memremap'd addr handling mm/hugetlb.c: fix incorrect proc nr_hugepages value mm, x86: fix pte_page() crash in gup_pte_range() fsnotify: turn fsnotify reaper thread into a workqueue job Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread" mm: fix regression in remap_file_pages() emulation thp, dax: do not try to withdraw pgtable from non-anon VMA	2016-02-19 13:36:00 -08:00
Al Viro	c1223ca48b	orangefs: get rid of op refcounts not needed anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:56 -05:00
Al Viro	05a50a5be8	orangefs: have ..._clean_interrupted_...() wait for copy to/from daemon * turn all those list_del(&op->list) into list_del_init() * don't pick ops that are already given up in control device ->read()/->write_iter(). * have orangefs_clean_interrupted_operation() notice if op is currently being copied to/from daemon (by said ->read()/->write_iter()) and wait for that to finish. * when we are done copying to/from daemon and find that it had been given up while we were doing that, wake the waiting ..._clean_interrupted_... As the result, we are guaranteed that orangefs_clean_interrupted_operation(op) doesn't return until nobody else can see op. Moreover, we don't need to play with op refcounts anymore. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:56 -05:00
Al Viro	5964c1b839	orangefs: set correct ->downcall.status on failing to copy reply from daemon ... and clean the end of control device ->write_iter() while we are at it Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Mike Marshall	ddb84da38d	Orangefs: remove vestigial ASYNC code Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Mike Marshall	5253487e04	Orangefs: make some gossip statements more helpful. Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Al Viro	897c5df6cf	orangefs: get rid of op->done shouldn't be needed now Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:55 -05:00
Al Viro	82d37f19ff	orangefs_readdir_index_put(): get rid of bufmap argument Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:54 -05:00
Al Viro	ea2c9c9f65	orangefs: bufmap rewrite new waiting-for-slot logics: * make request for slot wait for bufmap to be set up if it comes before it's installed OR while it's running down * make closing control device wait for all slots to be freed * waiting itself rewritten to (open-coded) analogues of wait_event_... primitives - we would need wait_event_locked() and, pardon an obscenely long name, wait_event_interruptible_exclusive_timeout_locked(). * we never wait for more than slot_timeout_secs in total and, if during the wait the daemon goes away, we only allow ORANGEFS_BUFMAP_WAIT_TIMEOUT_SECS for it to come back. * (cosmetical) bitmap is used instead of an array of zeroes and ones * old (and only reached if we are about to corrupt memory) waiting for daemon restart in service_operation() removed. [Martin's fixes folded] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:54 -05:00
Al Viro	178041848a	orangefs_bufmap_..._query(): don't bother with refcounts ... just hold the spinlock while fetching the field in question. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:54 -05:00
Al Viro	05b39a8b5c	orangefs: lift handling of timeouts and attempts count to service_operation() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:54 -05:00
Al Viro	c72f15b7d9	service_operation(): don't block signals, just use ..._killable Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Al Viro	98815ade9e	orangefs: sanitize handling of request list * checking that daemon is running (to decide whether we want to limit the timeout) should be done after the damn thing is included into the list; doing that before means that if the daemon gets shut down in between, we'll end up waiting indefinitely (== up to kill -9). * cancels should go into the head of the queue - the sooner they are picked, the less work daemon has to do and the sooner we get to free the slot held by aborted operation. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Al Viro	d2d87a3b6d	orangefs: get rid of loop in wait_for_matching_downcall() turn op->waitq into struct completion... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Martin Brandenburg	cf22644a0e	orangefs: use S_ISREG(mode) and friends instead of mode & S_IFREG. Suggestion from Dan Carpenter. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Al Viro	78699e29fd	orangefs: delay freeing slot until cancel completes Make cancels reuse the aborted read/write op, to make sure they do not fail on lack of memory. Don't issue a cancel unless the daemon has seen our read/write, has not replied and isn't being shut down. If cancel is issued, don't wait for it to complete; stash the slot in there and just have it freed when cancel is finally replied to or purged (and delay dropping the reference until then, obviously). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Mike Marshall <hubcap@omnibond.com>	2016-02-19 13:45:53 -05:00
Eric Sandeen	6332b9b5e7	ext4: Make Q_GETNEXTQUOTA work for quota in hidden inodes We forgot to set .get_nextdqblk operation in quotactl_ops structure used by ext4 when quota is using hidden inode thus the operation was not really supported. Fix the omission. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Jan Kara <jack@suse.cz>	2016-02-19 19:28:07 +01:00
Jan Kara	74dae42785	ext4: fix crashes in dioread_nolock mode Competing overwrite DIO in dioread_nolock mode will just overwrite pointer to io_end in the inode. This may result in data corruption or extent conversion happening from IO completion interrupt because we don't properly set buffer_defer_completion() when unlocked DIO races with locked DIO to unwritten extent. Since unlocked DIO doesn't need io_end for anything, just avoid allocating it and corrupting pointer from inode for locked DIO. A cleaner fix would be to avoid these games with io_end pointer from the inode but that requires more intrusive changes so we leave that for later. Cc: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-02-19 00:33:21 -05:00
Jan Kara	ed8ad83808	ext4: fix bh->b_state corruption ext4 can update bh->b_state non-atomically in _ext4_get_block() and ext4_da_get_block_prep(). Usually this is fine since bh is just a temporary storage for mapping information on stack but in some cases it can be fully living bh attached to a page. In such case non-atomic update of bh->b_state can race with an atomic update which then gets lost. Usually when we are mapping bh and thus updating bh->b_state non-atomically, nobody else touches the bh and so things work out fine but there is one case to especially worry about: ext4_finish_bio() uses BH_Uptodate_Lock on the first bh in the page to synchronize handling of PageWriteback state. So when blocksize < pagesize, we can be atomically modifying bh->b_state of a buffer that actually isn't under IO and thus can race e.g. with delalloc trying to map that buffer. The result is that we can mistakenly set / clear BH_Uptodate_Lock bit resulting in the corruption of PageWriteback state or missed unlock of BH_Uptodate_Lock. Fix the problem by always updating bh->b_state bits atomically. CC: stable@vger.kernel.org Reported-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-02-19 00:18:25 -05:00
Jeff Layton	0918f1c309	fsnotify: turn fsnotify reaper thread into a workqueue job We don't require a dedicated thread for fsnotify cleanup. Switch it over to a workqueue job instead that runs on the system_unbound_wq. In the interest of not thrashing the queued job too often when there are a lot of marks being removed, we delay the reaper job slightly when queueing it, to allow several to gather on the list. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-02-18 16:23:24 -08:00
Jeff Layton	13d34ac6e5	Revert "fsnotify: destroy marks with call_srcu instead of dedicated thread" This reverts commit `c510eff6be` ("fsnotify: destroy marks with call_srcu instead of dedicated thread"). Eryu reported that he was seeing some OOM kills kick in when running a testcase that adds and removes inotify marks on a file in a tight loop. The above commit changed the code to use call_srcu to clean up the marks. While that does (in principle) work, the srcu callback job is limited to cleaning up entries in small batches and only once per jiffy. It's easily possible to overwhelm that machinery with too many call_srcu callbacks, and Eryu's reproduer did just that. There's also another potential problem with using call_srcu here. While you can obviously sleep while holding the srcu_read_lock, the callbacks run under local_bh_disable, so you can't sleep there. It's possible when putting the last reference to the fsnotify_mark that we'll end up putting a chain of references including the fsnotify_group, uid, and associated keys. While I don't see any obvious ways that that could occurs, it's probably still best to avoid using call_srcu here after all. This patch reverts the above patch. A later patch will take a different approach to eliminated the dedicated thread here. Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Reported-by: Eryu Guan <guaneryu@gmail.com> Tested-by: Eryu Guan <guaneryu@gmail.com> Cc: Jan Kara <jack@suse.com> Cc: Eric Paris <eparis@parisplace.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-02-18 16:23:24 -08:00
Mimi Zohar	bc8ca5b92d	vfs: define kernel_read_file_id enumeration To differentiate between the kernel_read_file() callers, this patch defines a new enumeration named kernel_read_file_id and includes the caller identifier as an argument. Subsequent patches define READING_KEXEC_IMAGE, READING_KEXEC_INITRAMFS, READING_FIRMWARE, READING_MODULE, and READING_POLICY. Changelog v3: - Replace the IMA specific enumeration with a generic one. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk>	2016-02-18 17:14:04 -05:00
Mimi Zohar	b44a7dfc6f	vfs: define a generic function to read a file from the kernel For a while it was looked down upon to directly read files from Linux. These days there exists a few mechanisms in the kernel that do just this though to load a file into a local buffer. There are minor but important checks differences on each. This patch set is the first attempt at resolving some of these differences. This patch introduces a common function for reading files from the kernel with the corresponding security post-read hook and function. Changelog v4+: - export security_kernel_post_read_file() - Fengguang Wu v3: - additional bounds checking - Luis v2: - To simplify patch review, re-ordered patches Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: Luis R. Rodriguez <mcgrof@suse.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Al Viro <viro@zeniv.linux.org.uk>	2016-02-18 17:14:03 -05:00
Dave Hansen	c1192f8428	x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps The protection key can now be just as important as read/write permissions on a VMA. We need some debug mechanism to help figure out if it is in play. smaps seems like a logical place to expose it. arch/x86/kernel/setup.c is a bit of a weirdo place to put this code, but it already had seq_file.h and there was not a much better existing place to put it. We also use no #ifdef. If protection keys is .config'd out we will effectively get the same function as if we used the weak generic function. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Joerg Roedel <jroedel@suse.de> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Mark Williamson <mwilliamson@undo-software.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20160212210227.4F8EB3F8@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-02-18 19:46:29 +01:00
Jan Kara	ccf370e43e	quota: Forbid Q_GETQUOTA and Q_GETNEXTQUOTA for frozen filesystem Commit `7955118eaf` (quota: Allow Q_GETQUOTA for frozen filesystem) allowed Q_GETQUOTA call for frozen filesystem. It makes sense on the first look but zero-day testing has shown that with this change ext4 warns about starting a transaction for frozen filesystem. This happens because ext4_acquire_dquot() prepares for allocating space for new quota structure. Although it would be possible to implement Q_GETQUOTA for ext4 without allocating space for non-existent structures, the matter further complicates because OCFS2 needs to update on-disk structure use count when a new cluster node loads quota information from disk. So just revert the change and forbid Q_GETQUOTA together with Q_GETNEXTQUOTA for frozen filesystem. Add comment to quotactl_cmd_write() to save us from repeating this excercise in a few years when I forget again. Signed-off-by: Jan Kara <jack@suse.cz>	2016-02-18 14:03:03 +01:00
Jan Kara	044c9b6753	quota: Fix possible races during quota loading When loading new quota structure from disk, there is a possibility caller of dqget() will see uninitialized data due to CPU reordering loads or stores - loads from dquot can be reordered before test of DQ_ACTIVE_B bit or setting of this bit could be reordered before filling of the structure. Fix the issue by adding proper memory barriers. Signed-off-by: Jan Kara <jack@suse.cz>	2016-02-18 13:34:41 +01:00
Kinglong Mee	aa66b0bb08	btrfs: fix memory leak of fs_info in block group cache When starting up linux with btrfs filesystem, I got many memory leak messages by kmemleak as, unreferenced object 0xffff880066882000 (size 4096): comm "modprobe", pid 730, jiffies 4294690024 (age 196.599s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8174d52e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811d09aa>] kmem_cache_alloc_trace+0xea/0x1e0 [<ffffffffa03620fb>] btrfs_alloc_dummy_fs_info+0x6b/0x2a0 [btrfs] [<ffffffffa03624fc>] btrfs_alloc_dummy_block_group+0x5c/0x120 [btrfs] [<ffffffffa0360aa9>] btrfs_test_free_space_cache+0x39/0xed0 [btrfs] [<ffffffffa03b5a74>] trace_raw_output_xfs_attr_class+0x54/0xe0 [xfs] [<ffffffff81002122>] do_one_initcall+0xb2/0x1f0 [<ffffffff811765aa>] do_init_module+0x5e/0x1e9 [<ffffffff810fec09>] load_module+0x20a9/0x2690 [<ffffffff810ff439>] SyS_finit_module+0xb9/0xf0 [<ffffffff81757daf>] entry_SYSCALL_64_fastpath+0x12/0x76 [<ffffffffffffffff>] 0xffffffffffffffff unreferenced object 0xffff8800573f8000 (size 10256): comm "modprobe", pid 730, jiffies 4294690185 (age 196.460s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff8174d52e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8119ca6e>] kmalloc_order+0x5e/0x70 [<ffffffff8119caa4>] kmalloc_order_trace+0x24/0x90 [<ffffffffa03620b3>] btrfs_alloc_dummy_fs_info+0x23/0x2a0 [btrfs] [<ffffffffa03624fc>] btrfs_alloc_dummy_block_group+0x5c/0x120 [btrfs] [<ffffffffa036603d>] run_test+0xfd/0x320 [btrfs] [<ffffffffa0366f34>] btrfs_test_free_space_tree+0x94/0xee [btrfs] [<ffffffffa03b5aab>] trace_raw_output_xfs_attr_class+0x8b/0xe0 [xfs] [<ffffffff81002122>] do_one_initcall+0xb2/0x1f0 [<ffffffff811765aa>] do_init_module+0x5e/0x1e9 [<ffffffff810fec09>] load_module+0x20a9/0x2690 [<ffffffff810ff439>] SyS_finit_module+0xb9/0xf0 [<ffffffff81757daf>] entry_SYSCALL_64_fastpath+0x12/0x76 [<ffffffffffffffff>] 0xffffffffffffffff This patch lets btrfs using fs_info stored in btrfs_root for block group cache directly without allocating a new one. Fixes: `d0bd456074` ("Btrfs: add fragment=* debug mount option") Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 13:28:24 +01:00
Zhao Lei	4da2e26a2a	btrfs: Continue write in case of can_not_nocow btrfs failed in xfstests btrfs/080 with -o nodatacow. Can be reproduced by following script: DEV=/dev/vdg MNT=/mnt/tmp umount $DEV &>/dev/null mkfs.btrfs -f $DEV mount -o nodatacow $DEV $MNT dd if=/dev/zero of=$MNT/test bs=1 count=2048 & btrfs subvolume snapshot -r $MNT $MNT/test_snap & wait -- We can see dd failed on NO_SPACE. Reason: __btrfs_buffered_write should run cow write when no_cow impossible, and current code is designed with above logic. But check_can_nocow() have 2 type of return value(0 and <0) on can_not_no_cow, and current code only continue write on first case, the second case happened in doing subvolume. Fix: Continue write when check_can_nocow() return 0 and <0. Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>	2016-02-18 13:18:06 +01:00
Kinglong Mee	5598e9005a	btrfs: drop null testing before destroy functions Cleanup. kmem_cache_destroy has support NULL argument checking, so drop the double null testing before calling it. Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:46:03 +01:00
Sudip Mukherjee	89771cc98c	btrfs: fix build warning We were getting build warning about: fs/btrfs/extent-tree.c:7021:34: warning: ‘used_bg’ may be used uninitialized in this function It is not a valid warning as used_bg is never used uninitilized since locked is initially false so we can never be in the section where 'used_bg' is used. But gcc is not able to understand that and we can initialize it while declaring to silence the warning. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:46:03 +01:00
David Sterba	47dc196ae7	btrfs: use proper type for failrec in extent_state We use the private member of extent_state to store the failrec and play pointless pointer games. Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:46:03 +01:00
Deepa Dinamani	04b285f35e	btrfs: Replace CURRENT_TIME by current_fs_time() CURRENT_TIME macro is not appropriate for filesystems as it doesn't use the right granularity for filesystem timestamps. Use current_fs_time() instead. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Cc: Chris Mason <clm@fb.com> Cc: Josef Bacik <jbacik@fb.com> Cc: linux-btrfs@vger.kernel.org Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:46:03 +01:00
Dave Jones	8f682f6955	btrfs: remove open-coded swap() in backref.c:__merge_refs The kernel provides a swap() that does the same thing as this code. Signed-off-by: Dave Jones <dsj@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:45:55 +01:00
Byongho Lee	ac1407ba24	btrfs: remove redundant error check While running btrfs_mksubvol(), d_really_is_positive() is called twice. First in btrfs_mksubvol() and second inside btrfs_may_create(). So I remove the first one. Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:35:27 +01:00
Byongho Lee	0138b6fe8f	btrfs: simplify expression in btrfs_calc_trans_metadata_size() Simplify expression in btrfs_calc_trans_metadata_size(). Signed-off-by: Byongho Lee <bhlee.kernel@gmail.com> Reviewed-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:33:17 +01:00
Josef Bacik	baee879064	Btrfs: check reserved when deciding to background flush We will sometimes start background flushing the various enospc related things (delayed nodes, delalloc, etc) if we are getting close to reserving all of our available space. We don't want to do this however when we are actually using this space as it causes unneeded thrashing. We currently try to do this by checking bytes_used >= thresh, but bytes_used is only part of the equation, we need to use bytes_reserved as well as this represents space that is very likely to become bytes_used in the future. My tracing tool will keep count of the number of times we kick off the async flusher, the following are counts for the entire run of generic/027 No Patch Patch avg: 5385 5009 median: 5500 4916 We skewed lower than the average with my patch and higher than the average with the patch, overall it cuts the flushing from anywhere from 5-10%, which in the case of actual ENOSPC is quite helpful. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:29:43 +01:00
Josef Bacik	88d3a5aaf6	Btrfs: add transaction space reservation tracepoints There are a few places where we add to trans->bytes_reserved but don't have the corresponding trace point. With these added my tool no longer sees transaction leaks. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:22:41 +01:00
Josef Bacik	dc95f7bfc5	Btrfs: fix truncate_space_check truncate_space_check is using btrfs_csum_bytes_to_leaves() but forgetting to multiply by nodesize so we get an actual byte count. We need a tracepoint here so that we have the matching reserve for the release that will come later. Also add a comment to make clear what the intent of truncate_space_check is. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:22:24 +01:00
Josef Bacik	fb4b10e5d5	Btrfs: change how we update the global block rsv I'm writing a tool to visualize the enospc system in order to help debug enospc bugs and I found weird data and ran it down to when we update the global block rsv. We add all of the remaining free space to the block rsv, do a trace event, then remove the extra and do another trace event. This makes my visualization look silly and is unintuitive code as well. Fix this stuff to only add the amount we are missing, or free the amount we are missing. This is less clean to read but more explicit in what it is doing, as well as only emitting events for values that make sense. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 11:21:48 +01:00
Zhao Lei	7aff8cf4a6	btrfs: reada: ignore creating reada_extent for a non-existent device For a non-existent device, old code bypasses adding it in dev's reada queue. And to solve problem of unfinished waitting in raid5/6, commit `5fbc7c59fd` ("Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting") adding an exception for the first stripe, in short, the first stripe will always be processed whether the device exists or not. Actually we have a better way for the above request: just bypass creation of the reada_extent for non-existent device, it will make code simple and effective. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 10:27:23 +01:00
Zhao Lei	4fe7a0e138	btrfs: reada: avoid undone reada extents in btrfs_reada_wait Reada background works is not designed to finish all jobs completely, it will break in following case: 1: When a device reaches workload limit (MAX_IN_FLIGHT) 2: Total reads reach max limit (10000) 3: All devices don't have queued more jobs, often happened in DUP case And if all background works exit with remaining jobs, btrfs_reada_wait() will wait indefinetelly. Above problem is rarely happened in old code, because: 1: Every work queues 2x new works So many works reduced chances of undone jobs. 2: One work will continue 10000 times loop in case of no-jobs It reduced no-thread window time. But after we fixed above case, the "undone reada extents" frequently happened. Fix: Check to ensure we have at least one thread if there are undone jobs in btrfs_reada_wait(). Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>	2016-02-18 10:27:23 +01:00

... 78 79 80 81 82 ...

47524 Commits