android_kernel_xiaomi_sm8450

xiaomi-sm8450/android_kernel_xiaomi_sm8450

Author	SHA1	Message	Date
Paul E. McKenney	c7e88067c1	srcu: Exact tracking of srcu_data structures containing callbacks The current Tree SRCU implementation schedules a workqueue for every srcu_data covered by a given leaf srcu_node structure having callbacks, even if only one of those srcu_data structures actually contains callbacks. This is clearly inefficient for workloads that don't feature callbacks everywhere all the time. This commit therefore adds an array of masks that are used by the leaf srcu_node structures to track exactly which srcu_data structures contain callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Mike Galbraith <efault@gmx.de>	2017-04-26 11:23:12 -07:00
Ingo Molnar	58d30c36d4	Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Documentation updates. - Miscellaneous fixes. - Parallelize SRCU callback handling (plus overlapping patches). Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-04-23 11:12:44 +02:00
Paul E. McKenney	f2094107ac	Merge branches 'doc.2017.04.12a', 'fixes.2017.04.19a' and 'srcu.2017.04.21a' into HEAD doc.2017.04.12a: Documentation updates fixes.2017.04.19a: Miscellaneous fixes srcu.2017.04.21a: Parallelize SRCU callback handling	2017-04-21 06:00:13 -07:00
Paul E. McKenney	bcbfdd01dc	rcu: Make non-preemptive schedule be Tasks RCU quiescent state Currently, a call to schedule() acts as a Tasks RCU quiescent state only if a context switch actually takes place. However, just the call to schedule() guarantees that the calling task has moved off of whatever tracing trampoline that it might have been one previously. This commit therefore plumbs schedule()'s "preempt" parameter into rcu_note_context_switch(), which then records the Tasks RCU quiescent state, but only if this call to schedule() was -not- due to a preemption. To avoid adding overhead to the common-case context-switch path, this commit hides the rcu_note_context_switch() check under an existing non-common-case check. Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-21 05:59:27 -07:00
Paul E. McKenney	da915ad5cf	srcu: Parallelize callback handling Peter Zijlstra proposed using SRCU to reduce mmap_sem contention [1,2], however, there are workloads that could result in a high volume of concurrent invocations of call_srcu(), which with current SRCU would result in excessive lock contention on the srcu_struct structure's ->queue_lock, which protects SRCU's callback lists. This commit therefore moves SRCU to per-CPU callback lists, thus greatly reducing contention. Because a given SRCU instance no longer has a single centralized callback list, starting grace periods and invoking callbacks are both more complex than in the single-list Classic SRCU implementation. Starting grace periods and handling callbacks are now handled using an srcu_node tree that is in some ways similar to the rcu_node trees used by RCU-bh, RCU-preempt, and RCU-sched (for example, the srcu_node tree shape is controlled by exactly the same Kconfig options and boot parameters that control the shape of the rcu_node tree). In addition, the old per-CPU srcu_array structure is now named srcu_data and contains an rcu_segcblist structure named ->srcu_cblist for its callbacks (and a spinlock to protect this). The srcu_struct gets an srcu_gp_seq that is used to associate callback segments with the corresponding completion-time grace-period number. These completion-time grace-period numbers are propagated up the srcu_node tree so that the grace-period workqueue handler can determine whether additional grace periods are needed on the one hand and where to look for callbacks that are ready to be invoked. The srcu_barrier() function must now wait on all instances of the per-CPU ->srcu_cblist. Because each ->srcu_cblist is protected by ->lock, srcu_barrier() can remotely add the needed callbacks. In theory, it could also remotely start grace periods, but in practice doing so is complex and racy. And interestingly enough, it is never necessary for srcu_barrier() to start a grace period because srcu_barrier() only enqueues a callback when a callback is already present--and it turns out that a grace period has to have already been started for this pre-existing callback. Furthermore, it is only the callback that srcu_barrier() needs to wait on, not any particular grace period. Therefore, a new rcu_segcblist_entrain() function enqueues the srcu_barrier() function's callback into the same segment occupied by the last pre-existing callback in the list. The special case where all the pre-existing callbacks are on a different list (because they are in the process of being invoked) is handled by enqueuing srcu_barrier()'s callback into the RCU_DONE_TAIL segment, relying on the done-callbacks check that takes place after all callbacks are inovked. Note that the readers use the same algorithm as before. Note that there is a separate srcu_idx that tells the readers what counter to increment. This unfortunately cannot be combined with srcu_gp_seq because they need to be incremented at different times. This commit introduces some ugly #ifdefs in rcutorture. These will go away when I feel good enough about Tree SRCU to ditch Classic SRCU. Some crude performance comparisons, courtesy of a quickly hacked rcuperf asynchronous-grace-period capability: Callback Queuing Overhead ------------------------- # CPUS Classic SRCU Tree SRCU ------ ------------ --------- 2 0.349 us 0.342 us 16 31.66 us 0.4 us 41 --------- 0.417 us The times are the 90th percentiles, a statistic that was chosen to reject the overheads of the occasional srcu_barrier() call needed to avoid OOMing the test machine. The rcuperf test hangs when running Classic SRCU at 41 CPUs, hence the line of dashes. Despite the hacks to both the rcuperf code and that statistics, this is a convincing demonstration of Tree SRCU's performance and scalability advantages. [1] https://lwn.net/Articles/309030/ [2] https://patchwork.kernel.org/patch/5108281/ Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Fix initialization if synchronize_srcu_expedited() called first. ]	2017-04-21 05:59:26 -07:00
Paul E. McKenney	6ade8694f4	kvm: Move srcu_struct fields to end of struct kvm Parallelizing SRCU callback handling will increase the size of srcu_struct, which will move the kvm structure's kvm_arch field out of reach of powerpc's current assembly code, which will result in the following sort of build error: arch/powerpc/kvm/book3s_hv_rmhandlers.S:617: Error: operand out of range (0x000000000000b328 is not between 0xffffffffffff8000 and 0x0000000000007fff) This commit moves the srcu_struct fields in the kvm structure to follow the kvm_arch field, which will allow powerpc's assembly code to continue to be able to reach the kvm_arch field. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Michael Ellerman <michaele@au1.ibm.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Paolo Bonzini <pbonzini@redhat.com> [ paulmck: Moved this commit to precede SRCU callback parallelization, and reworded the commit log into future tense, all in the name of bisectability. ]	2017-04-21 05:56:31 -07:00
Michael S. Tsirkin	48ac34666f	hlist_add_tail_rcu disable sparse warning sparse is unhappy about this code in hlist_add_tail_rcu: struct hlist_node i, last = NULL; for (i = hlist_first_rcu(h); i; i = hlist_next_rcu(i)) last = i; This is because hlist_next_rcu and hlist_next_rcu return __rcu pointers. It's a false positive - it's a write side primitive and so does not need to be called in a read side critical section. The following trivial patch disables the warning without changing the behaviour in any way. Note: __hlist_for_each_rcu would also remove the warning but it would be confusing since it calls rcu_derefence and is designed to run in the rcu read side critical section. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-19 09:29:18 -07:00
Paul E. McKenney	468d01bec5	types: Update obsolete callback_head comment The comment header for callback_head (and thus for rcu_head) states that the bottom two bits of a pointer to these structures must be zero. This is obsolete: The new requirement is that only the bottom bit need be zero. This commit therefore updates this comment. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-19 09:29:17 -07:00
Paul E. McKenney	5f0d5a3ae7	mm: Rename SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU A group of Linux kernel hackers reported chasing a bug that resulted from their assumption that SLAB_DESTROY_BY_RCU provided an existence guarantee, that is, that no block from such a slab would be reallocated during an RCU read-side critical section. Of course, that is not the case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire slab of blocks. However, there is a phrase for this, namely "type safety". This commit therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order to avoid future instances of this sort of confusion. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <linux-mm@kvack.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> [ paulmck: Add comments mentioning the old name, as requested by Eric Dumazet, in order to help people familiar with the old name find the new one. ] Acked-by: David Rientjes <rientjes@google.com>	2017-04-18 11:42:36 -07:00
Paul E. McKenney	dad81a2026	srcu: Introduce CLASSIC_SRCU Kconfig option The TREE_SRCU rewrite is large and a bit on the non-simple side, so this commit helps reduce risk by allowing the old v4.11 SRCU algorithm to be selected using a new CLASSIC_SRCU Kconfig option that depends on RCU_EXPERT. The default is to use the new TREE_SRCU and TINY_SRCU algorithms, in order to help get these the testing that they need. However, if your users do not require the update-side scalability that is to be provided by TREE_SRCU, select RCU_EXPERT and then CLASSIC_SRCU to revert back to the old classic SRCU algorithm. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:23 -07:00
Paul E. McKenney	d8be81735a	srcu: Create a tiny SRCU In response to automated complaints about modifications to SRCU increasing its size, this commit creates a tiny SRCU that is used in SMP=n && PREEMPT=n builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:22 -07:00
Paul E. McKenney	f60d231a87	srcu: Crude control of expedited grace periods SRCU's implementation of expedited grace periods has always assumed that the SRCU instance is idle when the expedited request arrives. This commit improves this a bit by maintaining a count of the number of outstanding expedited requests, thus allowing prior non-expedited grace periods accommodate these requests by shifting to expedited mode. However, any non-expedited wait already in progress will still wait for the full duration. Improved control of expedited grace periods is planned, but one step at a time. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:22 -07:00
Paul E. McKenney	80a7956fe3	srcu: Merge ->srcu_state into ->srcu_gp_seq Updating ->srcu_state and ->srcu_gp_seq will lead to extremely complex race conditions given multiple callback queues, so this commit takes advantage of the two-bit state now available in rcu_seq counters to store the state in the bottom two bits of ->srcu_gp_seq. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:22 -07:00
Paul E. McKenney	2b34c43cc1	srcu: Move rcu_init_levelspread() to rcu_tree_node.h This commit moves the rcu_init_levelspread() function from kernel/rcu/tree.c to kernel/rcu/rcu.h so that SRCU can access it. This is another step towards enabling SRCU to create its own combining tree. This commit is code-movement only, give or take knock-on adjustments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	f2425b4efb	srcu: Move combining-tree definitions for SRCU's benefit This commit moves the C preprocessor code that defines the default shape of the rcu_node combining tree to a new include/linux/rcu_node_tree.h file as a first step towards enabling SRCU to create its own combining tree, which in turn enables SRCU to implement per-CPU callback handling, thus avoiding contention on the lock currently guarding the single list of callbacks. Note that users of SRCU still need to know the size of the srcu_struct structure, hence include/linux rather than kernel/rcu. This commit is code-movement only. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	8660b7d8a5	srcu: Use rcu_segcblist to track SRCU callbacks This commit switches SRCU from custom-built callback queues to the new rcu_segcblist structure. This change associates grace-period sequence numbers with groups of callbacks, which will be needed for efficient processing of per-CPU callbacks. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	ac367c1c62	srcu: Add grace-period sequence numbers This commit adds grace-period sequence numbers, which will be used to handle mid-boot grace periods and per-CPU callback lists. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	c2a8ec0778	srcu: Move to state-based grace-period sequencing The current SRCU grace-period processing might never reach the last portion of srcu_advance_batches(). This is OK given the current implementation, as the first portion, up to the try_check_zero() following the srcu_flip() is sufficient to drive grace periods forward. However, it has the unfortunate side-effect of making it impossible to determine when a given grace period has ended, and it will be necessary to efficiently trace ends of grace periods in order to efficiently handle per-CPU SRCU callback lists. This commit therefore adds states to the SRCU grace-period processing, so that the end of a given SRCU grace period is marked by the transition to the SRCU_STATE_DONE state. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:20 -07:00
Paul E. McKenney	900b1028ec	srcu: Allow SRCU to access rcu_scheduler_active This is primarily a code-movement commit in preparation for allowing SRCU to handle early-boot SRCU grace periods. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2017-04-18 11:38:18 -07:00
Paul E. McKenney	77e5849688	rcu: Make arch select smp_mb__after_unlock_lock() strength The definition of smp_mb__after_unlock_lock() is currently smp_mb() for CONFIG_PPC and a no-op otherwise. It would be better to instead provide an architecture-selectable Kconfig option, and select the strength of smp_mb__after_unlock_lock() based on that option. This commit therefore creates ARCH_WEAK_RELEASE_ACQUIRE, has PPC select it, and bases the definition of smp_mb__after_unlock_lock() on this new ARCH_WEAK_RELEASE_ACQUIRE Kconfig option. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Boqun Feng <boqun.feng@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <linuxppc-dev@lists.ozlabs.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2017-04-18 11:20:15 -07:00
Paul E. McKenney	b8c17e6664	rcu: Maintain special bits at bottom of ->dynticks counter Currently, IPIs are used to force other CPUs to invalidate their TLBs in response to a kernel virtual-memory mapping change. This works, but degrades both battery lifetime (for idle CPUs) and real-time response (for nohz_full CPUs), and in addition results in unnecessary IPIs due to the fact that CPUs executing in usermode are unaffected by stale kernel mappings. It would be better to cause a CPU executing in usermode to wait until it is entering kernel mode to do the flush, first to avoid interrupting usemode tasks and second to handle multiple flush requests with a single flush in the case of a long-running user task. This commit therefore reserves a bit at the bottom of the ->dynticks counter, which is checked upon exit from extended quiescent states. If it is set, it is cleared and then a new rcu_eqs_special_exit() macro is invoked, which, if not supplied, is an empty single-pass do-while loop. If this bottom bit is set on -entry- to an extended quiescent state, then a WARN_ON_ONCE() triggers. This bottom bit may be set using a new rcu_eqs_special_set() function, which returns true if the bit was set, or false if the CPU turned out to not be in an extended quiescent state. Please note that this function refuses to set the bit for a non-nohz_full CPU when that CPU is executing in usermode because usermode execution is tracked by RCU as a dyntick-idle extended quiescent state only for nohz_full CPUs. Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2017-04-18 11:19:22 -07:00
Heiner Kallweit	5ef1ecf060	mmc: sdio: fix alignment issue in struct sdio_func Certain 64-bit systems (e.g. Amlogic Meson GX) require buffers to be used for DMA to be 8-byte-aligned. struct sdio_func has an embedded small DMA buffer not meeting this requirement. When testing switching to descriptor chain mode in meson-gx driver SDIO is broken therefore. Fix this by allocating the small DMA buffer separately as kmalloc ensures that the returned memory area is properly aligned for every basic data type. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Tested-by: Helmut Klein <hgkr.klein@gmail.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>	2017-04-18 19:18:07 +02:00
Linus Torvalds	7395ca0f91	Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Again, a batch that's been sitting a couple of weeks, mostly because I anticipated a bit more material but it didn't show up -- which is good. These are all your garden variety fixes for ARM platforms. The most visible issue fixed here is probably the SMP reset issue on OMAP, the rest are minor stuff" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: arm64: allwinner: a64: add pmu0 regs for USB PHY ARM: OMAP2+: omap_device: Sync omap_device and pm_runtime after probe defer reset: add exported __reset_control_get, return NULL if optional ARM: orion5x: only call into phylib when available ARM: omap2+: Revert omap-smp.c changes resetting CPU1 during boot ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend ARM: dts: ti: fix PCI bus dtc warnings ARM: dts: am335x-baltos: disable EEE for Atheros 8035 PHY ARM: dts: OMAP3: Fix MFG ID EEPROM ARM: sun8i: a33: add operating-points-v2 property to all nodes ARM: sun8i: a33: remove highest OPP to fix CPU crashes	2017-04-16 12:38:17 -07:00
Linus Torvalds	a86f106f48	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Four small fixes. Three of them fix the same error in NVMe, in loop, fc, and rdma respectively. The last fix from Ming fixes a regression in this series, where our bvec gap logic was wrong and causes an oops on NVMe for certain conditions" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix bio_will_gap() for first bvec with offset nvme-fc: Fix sqsize wrong assignment based on ctrl MQES capability nvme-rdma: Fix sqsize wrong assignment based on ctrl MQES capability nvme-loop: Fix sqsize wrong assignment based on ctrl MQES capability	2017-04-16 12:05:09 -07:00
Linus Torvalds	7e703eccf0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Things seem to be settling down as far as networking is concerned, let's hope this trend continues... 1) Add iov_iter_revert() and use it to fix the behavior of skb_copy_datagram_msg() et al., from Al Viro. 2) Fix the protocol used in the synthetic SKB we cons up for the purposes of doing a simulated route lookup for RTM_GETROUTE requests. From Florian Larysch. 3) Don't add noop_qdisc to the per-device qdisc hashes, from Cong Wang. 4) Don't call netdev_change_features with the team lock held, from Xin Long. 5) Revert TCP F-RTO extension to catch more spurious timeouts because it interacts very badly with some middle-boxes. From Yuchung Cheng. 6) Fix the loss of error values in l2tp {s,g}etsockopt calls, from Guillaume Nault. 7) ctnetlink uses bit positions where it should be using bit masks, fix from Liping Zhang. 8) Missing RCU locking in netfilter helper code, from Gao Feng. 9) Avoid double frees and use-after-frees in tcp_disconnect(), from Eric Dumazet. 10) Don't do a changelink before we register the netdevice in bridging, from Ido Schimmel. 11) Lock the ipv6 device address list properly, from Rabin Vincent" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits) netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage netfilter: nft_hash: do not dump the auto generated seed drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201 ipv6: Fix idev->addr_list corruption net: xdp: don't export dev_change_xdp_fd() bridge: netlink: register netdevice before executing changelink bridge: implement missing ndo_uninit() bpf: reference may_access_skb() from __bpf_prog_run() tcp: clear saved_syn in tcp_disconnect() netfilter: nf_ct_expect: use proper RCU list traversal/update APIs netfilter: ctnetlink: skip dumping expect when nfct_help(ct) is NULL netfilter: make it safer during the inet6_dev->addr_list traversal netfilter: ctnetlink: make it safer when checking the ct helper name netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_find netfilter: ctnetlink: using bit to represent the ct event netfilter: xt_TCPMSS: add more sanity tests on tcph->doff net: tcp: Increase TCP_MIB_OUTRSTS even though fail to alloc skb l2tp: don't mask errors in pppol2tp_getsockopt() l2tp: don't mask errors in pppol2tp_setsockopt() tcp: restrict F-RTO to work-around broken middle-boxes ...	2017-04-14 17:38:24 -07:00
Ming Lei	5a8d75a1b8	block: fix bio_will_gap() for first bvec with offset Commit 729204ef49ec("block: relax check on sg gap") allows us to merge bios, if both are physically contiguous. This change can merge a huge number of small bios, through mkfs for example, mkfs.ntfs running time can be decreased to ~1/10. But if one rq starts with a non-aligned buffer (the 1st bvec's bv_offset is non-zero) and if we allow the merge, it is quite difficult to respect sg gap limit, especially the max segment size, or we risk having an unaligned virtual boundary. This patch tries to avoid the issue by disallowing a merge, if the req starts with an unaligned buffer. Also add comments to explain why the merged segment can't end in unaligned virt boundary. Fixes: `729204ef49` ("block: relax check on sg gap") Tested-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Rewrote parts of the commit message and comments. Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-14 13:58:29 -06:00
Linus Torvalds	7873933385	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fixes from Michael S. Tsirkin: "virtio oops fixes The virtio pci rework using shared interrupts caused a lot of issues. We tried to fix them but run out of time. Revert for now, and revisit the issue for the next kernel. Luckily we are able to do this without loosing automatic interrupt NUMA affinity which was the main motivator for the rework" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-pci: Remove affinity hint before freeing the interrupt Revert "virtio_pci: remove struct virtio_pci_vq_info" Revert "virtio_pci: use shared interrupts for virtqueues" Revert "virtio_pci: don't duplicate the msix_enable flag in struct pci_dev" Revert "virtio_pci: simplify MSI-X setup" Revert "virtio_pci: fix out of bound access for msix_names" MAINTAINERS: fix virtio file pattern virtio_console: fix uninitialized variable use virtio_net: clear MTU when out of range virtio: allow drivers to validate features virtio_net: enable big packets for large MTU values	2017-04-14 08:49:39 -07:00
Kirill A. Shutemov	c0c379e293	mm: drop unused pmdp_huge_get_and_clear_notify() Dave noticed that after fixing MADV_DONTNEED vs numa balancing race the last pmdp_huge_get_and_clear_notify() user is gone. Let's drop the helper. Link: http://lkml.kernel.org/r/20170306112047.24809-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-13 18:24:21 -07:00
Linus Torvalds	06ea4c38bc	Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "This contains fixes for two long standing subtle bugs: - kthread_bind() on a new kthread binds it to specific CPUs and prevents userland from messing with the affinity or cgroup membership. Unfortunately, for cgroup membership, there's a window between kthread creation and kthread_bind() invocation where the kthread can be moved into a non-root cgroup by userland. Depending on what controllers are in effect, this can assign the kthread unexpected attributes. For example, in the reported case, workqueue workers ended up in a non-root cpuset cgroups and had their CPU affinities overridden. This broke workqueue invariants and led to workqueue stalls. Fixed by closing the window between kthread creation and kthread_bind() as suggested by Oleg. - There was a bug in cgroup mount path which could allow two competing mount attempts to attach the same cgroup_root to two different superblocks. This was caused by mishandling return value from kernfs_pin_sb(). Fixed" 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: avoid attaching a cgroup root to two different superblocks cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups	2017-04-11 23:38:16 -07:00
Linus Torvalds	2a610b8aa8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull VFS fixes from Al Viro: "statx followup fixes and a fix for stack-smashing on alpha" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: alpha: fix stack smashing in old_adjtimex(2) statx: Include a mask for stx_attributes in struct statx statx: Reserve the top bit of the mask for future struct expansion xfs: report crtime and attribute flags to statx ext4: Add statx support statx: optimize copy of struct statx to userspace statx: remove incorrect part of vfs_statx() comment statx: reject unknown flags when using NULL path Documentation/filesystems: fix documentation for ->getattr()	2017-04-09 08:26:21 -07:00
Linus Torvalds	78d91a75b4	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "Here's a pull request for 4.11-rc, fixing a set of issues mostly centered around the new scheduling framework. These have been brewing for a while, but split up into what we absolutely need in 4.11, and what we can defer until 4.12. These are well tested, on both single queue and multiqueue setups, and with and without shared tags. They fix several hangs that have happened in testing. This is obviously larger than I would have preferred at this point in time, but I don't think we can shave much off this and still get the desired results. In detail, this pull request contains: - a set of five fixes for NVMe, mostly from Christoph and one from Roland. - a series from Bart, fixing issues with dm-mq and SCSI shared tags and scheduling. Note that one of those patches commit messages may read like an optimization, but it is in fact an important fix for queue restarts in particular. - a series from Omar, most importantly fixing a hang with multiple hardware queues when we fail to get a driver tag. Another important fix in there is for resizing hardware queues, which nbd does when handling multiple sockets for one connection. - fixing an imbalance in putting the ctx for hctx request allocations from Minchan" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: Restart a single queue if tag sets are shared dm rq: Avoid that request processing stalls sporadically scsi: Avoid that SCSI queues get stuck blk-mq: Introduce blk_mq_delay_run_hw_queue() blk-mq: remap queues when adding/removing hardware queues blk-mq-sched: fix crash in switch error path blk-mq-sched: set up scheduler tags when bringing up new queues blk-mq-sched: refactor scheduler initialization blk-mq: use the right hctx when getting a driver tag fails nvmet: fix byte swap in nvmet_parse_io_cmd nvmet: fix byte swap in nvmet_execute_write_zeroes nvmet: add missing byte swap in nvmet_get_smart_log nvme: add missing byte swap in nvme_setup_discard nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 block: do not put mq context in blk_mq_alloc_request_hctx	2017-04-08 11:56:58 -07:00
Linus Torvalds	c3df1c7c36	Merge tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fix from Linus Walleij: "This late fix for pin control is hopefully the last I send this cycle. The problem was detected early in the v4.11 release cycle and there has been some back and forth on how to solve it. Sadly the proper fix arrives late, but at least not too late. An issue was detected with pin control on the Freescale i.MX after the refactorings for more general group and function handling. We now have the proper fix for this" * tag 'pinctrl-v4.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable()	2017-04-08 11:43:38 -07:00
Linus Torvalds	542380a208	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Radim Krčmář: "ARM: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code PPC: - Check for a NULL return from kzalloc s390: - Prevent translation exception errors on valid page tables for the instruction-exection-protection support x86: - Fix Page-Modification Logging when running a nested guest" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl KVM: nVMX: initialize PML fields in vmcs02 KVM: nVMX: do not leak PML full vmexit to L1 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI KVM: arm64: Ensure LRs are clear when they should be kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd KVM: s390: remove change-recording override support arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm	2017-04-08 01:39:43 -07:00
Olof Johansson	d4ee21ef61	Merge tag 'reset-fixes-for-4.11-2' of git://git.pengutronix.de/git/pza/linux into fixes Reset controller fixes for v4.11 Fix devm_reset_controller_get_optional to return NULL for non-DT devices, if the RESET_CONTROLLER Kconfig option is enabled. This fixes probe failures of the 8250_dw driver on Intel platforms after commit `acbdad8dd1` ("serial: 8250_dw: simplify optional reset handling"). * tag 'reset-fixes-for-4.11-2' of git://git.pengutronix.de/git/pza/linux: reset: add exported __reset_control_get, return NULL if optional Signed-off-by: Olof Johansson <olof@lixom.net>	2017-04-07 16:49:08 -07:00
Bart Van Assche	6d8c6c0f97	blk-mq: Restart a single queue if tag sets are shared To improve scalability, if hardware queues are shared, restart a single hardware queue in round-robin fashion. Rename blk_mq_sched_restart_queues() to reflect the new semantics. Remove blk_mq_sched_mark_restart_queue() because this function has no callers. Remove flag QUEUE_FLAG_RESTART because this patch removes the code that uses this flag. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:40:09 -06:00
Bart Van Assche	7587a5ae7e	blk-mq: Introduce blk_mq_delay_run_hw_queue() Introduce a function that runs a hardware queue unconditionally after a delay. Note: there is already a function that stops and restarts a hardware queue after a delay, namely blk_mq_delay_queue(). This function will be used in the next patch in this series. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 12:27:06 -06:00
Omar Sandoval	54d5329d42	blk-mq-sched: fix crash in switch error path In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall back to the original scheduler. However, at this point, we've already torn down the original scheduler's tags, so this causes a crash. Doing the fallback like the legacy elevator path is much harder for mq, so fix it by just falling back to none, instead. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-07 08:56:48 -06:00
Michael S. Tsirkin	404123c2db	virtio: allow drivers to validate features Some drivers can't support all features in all configurations. At the moment we blindly set FEATURES_OK and later FAILED. Support this better by adding a callback drivers can use to do some early checks. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-04-07 16:38:59 +03:00
Tony Lindgren	6118714275	pinctrl: core: Fix pinctrl_register_and_init() with pinctrl_enable() Recent pinctrl changes to allow dynamic allocation of pins exposed one more issue with the pinctrl pins claimed early by the controller itself. This caused a regression for IMX6 pinctrl hogs. Before enabling the pin controller driver we need to wait until it has been properly initialized, then claim the hogs, and only then enable it. To fix the regression, split the code into pinctrl_claim_hogs() and pinctrl_enable(). And then let's require that pinctrl_enable() is always called by the pin controller driver when ready after calling pinctrl_register_and_init(). Depends-on: `950b0d91dc` ("pinctrl: core: Fix regression caused by delayed work for hogs") Fixes: df61b366af26 ("pinctrl: core: Use delayed work for hogs") Fixes: `e566fc11ea` ("pinctrl: imx: use generic pinctrl helpers for managing groups") Cc: Haojian Zhuang <haojian.zhuang@linaro.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Mika Penttilä <mika.penttila@nextfour.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Nishanth Menon <nm@ti.com> Cc: Shawn Guo <shawnguo@kernel.org> Cc: Stefan Agner <stefan@agner.ch> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Tested-by: Gary Bisson <gary.bisson@boundarydevices.com> Tested-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2017-04-07 01:08:08 +02:00
David S. Miller	0e4c0ee580	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2017-04-06 11:57:04 -07:00
Linus Torvalds	aeb4a57681	Merge tag 'mfd-fixes-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD bug fix from Lee Jones: "Increase buffer size om cros-ec to allow for SPI messages" * tag 'mfd-fixes-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: cros-ec: Fix host command buffer size	2017-04-05 09:04:26 -07:00
Radim Krčmář	6fd6410311	Merge tag 'kvm-arm-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm From: Christoffer Dall <cdall@linaro.org> KVM/ARM Fixes for v4.11-rc6 Fixes include: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code	2017-04-05 16:27:47 +02:00
Vic Yang	b2376407f9	mfd: cros-ec: Fix host command buffer size For SPI, we can get up to 32 additional bytes for response preamble. The current overhead (2 bytes) may cause problems when we try to receive a big response. Update it to 32 bytes. Without this fix we could see a kernel BUG when we receive a big response from the Chrome EC when is connected via SPI. Signed-off-by: Vic Yang <victoryang@google.com> Tested-by: Enric Balletbo i Serra <enric.balletbo.collabora.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>	2017-04-05 13:30:07 +01:00
Philipp Zabel	62e24c5775	reset: add exported __reset_control_get, return NULL if optional Rename the internal __reset_control_get/put functions to __reset_control_get/put_internal and add an exported __reset_control_get equivalent to __of_reset_control_get that takes a struct device parameter. This avoids the confusing call to __of_reset_control_get in the non-DT case and fixes the devm_reset_control_get_optional function to return NULL if RESET_CONTROLLER is enabled but dev->of_node == NULL. Fixes: `bb475230b8` ("reset: make optional functions really optional") Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Ramiro Oliveira <Ramiro.Oliveira@synopsys.com> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-04-04 17:36:10 +02:00
Christoffer Dall	6d56111c92	KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI As an oversight, for GICv2, we accidentally export the GICC_PMR register in the format of the GICH_VMCR.VMPriMask field in the lower 5 bits of a word, meaning that userspace must always use the lower 5 bits to communicate with the KVM device and must shift the value left by 3 places to obtain the actual priority mask level. Since GICv3 supports the full 8 bits of priority masking in the ICH_VMCR, we have to fix the value we export when emulating a GICv2 on top of a hardware GICv3 and exporting the emulated GICv2 state to userspace. Take the chance to clarify this aspect of the ABI. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org>	2017-04-04 14:33:59 +02:00
David Howells	3209f68b3c	statx: Include a mask for stx_attributes in struct statx Include a mask in struct stat to indicate which bits of stx_attributes the filesystem actually supports. This would also be useful if we add another system call that allows you to do a 'bulk attribute set' and pass in a statx struct with the masks appropriately set to say what you want to set. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-03 01:06:00 -04:00
Linus Torvalds	128c434a70	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "This update provides: - make the scheduler clock switch to unstable mode smooth so the timestamps stay at microseconds granularity instead of switching to tick granularity. - unbreak perf test tsc by taking the new offset into account which was added in order to proveide better sched clock continuity - switching sched clock to unstable mode runs all clock related computations which affect the sched clock output itself from a work queue. In case of preemption sched clock uses half updated data and provides wrong timestamps. Keep the math in the protected context and delegate only the static key switch to workqueue context. - remove a duplicate header include" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/headers: Remove duplicate #include <linux/sched/debug.h> line sched/clock: Fix broken stable to unstable transfer sched/clock, x86/perf: Fix "perf test tsc" sched/clock: Fix clear_sched_clock_stable() preempt wobbly	2017-04-02 09:25:10 -07:00
Linus Torvalds	4a6808f347	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two small fixes for the new CLKEVT_OF infrastructure" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: vmlinux.lds: Add __clkevt_of_table to kernel clockevents: Fix syntax error in clkevt-of macro	2017-04-02 09:22:03 -07:00
Al Viro	27c0e3748e	[iov_iter] new privimitive: iov_iter_revert() opposite to iov_iter_advance(); the caller is responsible for never using it to move back past the initial position. Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-02 12:10:47 -04:00
Roland Dreier	bf17aa36c0	nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 The enum values for QPTYPE, PRTYPE and CMS are off by 1 from the values defined in figure 42 of the NVM Express over Fabrics 1.0: http://www.nvmexpress.org/wp-content/uploads/NVMe_over_Fabrics_1_0_Gold_20160605-1.pdf Fix our enums to match the final spec. Signed-off-by: Roland Dreier <roland@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>	2017-04-02 10:24:07 +03:00

1 2 3 4 5 ...

54662 Commits