795ae7a0de6b834a0cc202aa55c190ef81496665
578123 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
795ae7a0de |
mm: scale kswapd watermarks in proportion to memory
In machines with 140G of memory and enterprise flash storage, we have seen read and write bursts routinely exceed the kswapd watermarks and cause thundering herds in direct reclaim. Unfortunately, the only way to tune kswapd aggressiveness is through adjusting min_free_kbytes - the system's emergency reserves - which is entirely unrelated to the system's latency requirements. In order to get kswapd to maintain a 250M buffer of free memory, the emergency reserves need to be set to 1G. That is a lot of memory wasted for no good reason. On the other hand, it's reasonable to assume that allocation bursts and overall allocation concurrency scale with memory capacity, so it makes sense to make kswapd aggressiveness a function of that as well. Change the kswapd watermark scale factor from the currently fixed 25% of the tunable emergency reserve to a tunable 0.1% of memory. Beyond 1G of memory, this will produce bigger watermark steps than the current formula in default settings. Ensure that the new formula never chooses steps smaller than that, i.e. 25% of the emergency reserve. On a 140G machine, this raises the default watermark steps - the distance between min and low, and low and high - from 16M to 143M. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
3ed3a4f0dd |
mm: cleanup *pte_alloc* interfaces
There are few things about *pte_alloc*() helpers worth cleaning up: - 'vma' argument is unused, let's drop it; - most __pte_alloc() callers do speculative check for pmd_none(), before taking ptl: let's introduce pte_alloc() macro which does the check. The only direct user of __pte_alloc left is userfaultfd, which has different expectation about atomicity wrt pmd. - pte_alloc_map() and pte_alloc_map_lock() are redefined using pte_alloc(). [sudeep.holla@arm.com: fix build for arm64 hugetlbpage] [sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
5057dcd0f1 |
virtio_balloon: export 'available' memory to balloon statistics
Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory statistics protocol, corresponding to 'Available' in /proc/meminfo. It indicates to the hypervisor how big the balloon can be inflated without pushing the guest system to swap. Signed-off-by: Igor Redko <redkoi@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
d02bd27bd3 |
mm/page_alloc.c: calculate 'available' memory in a separate function
Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory statistics protocol, corresponding to 'Available' in /proc/meminfo. It indicates to the hypervisor how big the balloon can be inflated without pushing the guest system to swap. This metric would be very useful in VM orchestration software to improve memory management of different VMs under overcommit. This patch (of 2): Factor out calculation of the available memory counter into a separate exportable function, in order to be able to use it in other parts of the kernel. In particular, it appears a relevant metric to report to the hypervisor via virtio-balloon statistics interface (in a followup patch). Signed-off-by: Igor Redko <redkoi@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
7eb50292d7 |
mm/Kconfig: remove redundant arch depend for memory hotplug
MEMORY_HOTPLUG already depends on ARCH_ENABLE_MEMORY_HOTPLUG which is selected by the supported architectures, so the following arch depend is unnecessary. Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
01609ec2fa |
ARC, thp: remove infrastructure for handling splitting PMDs
With THP refcounting work, no need to mark PMDs splitting. (ARC got missed under the sweeping arch change as THP support was likely not present in orig baseline) Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
458aa76d13 |
mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range
We remove one instace of flush_tlb_range here. That was added by commit
|
||
|
|
bcf6691797 |
mm, tracing: refresh __def_vmaflag_names
Get list of VMA flags up-to-date and sort it to match VM_* definition order. [vbabka@suse.cz: add a note above vmaflag definitions to update the names when changing] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
39a1aa8e19 |
mm: deduplicate memory overcommitment code
Currently we have two copies of the same code which implements memory overcommitment logic. Let's move it into mm/util.c and hence avoid duplication. No functional changes here. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
ea606cf5d8 |
mm: move max_map_count bits into mm.h
max_map_count sysctl unrelated to scheduler. Move its bits from include/linux/sched/sysctl.h to include/linux/mm.h. Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
f9719a03de |
thp, vmstats: count deferred split events
Count how many times we put a THP in split queue. Currently, it happens on partial unmap of a THP. Rapidly growing value can indicate that an application behaves unfriendly wrt THP: often fault in huge page and then unmap part of it. This leads to unnecessary memory fragmentation and the application may require tuning. The event also can help with debugging kernel [mis-]behaviour. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
0a6b76dd23 |
mm: workingset: make shadow node shrinker memcg aware
Workingset code was recently made memcg aware, but shadow node shrinker is still global. As a result, one small cgroup can consume all memory available for shadow nodes, possibly hurting other cgroups by reclaiming their shadow nodes, even though reclaim distances stored in its shadow nodes have no effect. To avoid this, we need to make shadow node shrinker memcg aware. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
cdcbb72ebf |
mm: workingset: size shadow nodes lru basing on file cache size
A page is activated on refault if the refault distance stored in the corresponding shadow entry is less than the number of active file pages. Since active file pages can't occupy more than half memory, we assume that the maximal effective refault distance can't be greater than half the number of present pages and size the shadow nodes lru list appropriately. Generally speaking, this assumption is correct, but it can result in wasting a considerable chunk of memory on stale shadow nodes in case the portion of file pages is small, e.g. if a workload mostly uses anonymous memory. To sort this out, we need to compute the size of shadow nodes lru basing not on the maximal possible, but the current size of file cache. We could take the size of active file lru for the maximal refault distance, but active lru is pretty unstable - it can shrink dramatically at runtime possibly disrupting workingset detection logic. Instead we assume that the maximal refault distance equals half the total number of file cache pages. This will protect us against active file lru size fluctuations while still being correct, because size of active lru is normally maintained lower than size of inactive lru. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
58e698af4c |
radix-tree: account radix_tree_node to memory cgroup
Allocation of radix_tree_node objects can be easily triggered from userspace, so we should account them to memory cgroup. Besides, we need them accounted for making shadow node shrinker per memcg (see mm/workingset.c). A tricky thing about accounting radix_tree_node objects is that they are mostly allocated through radix_tree_preload(), so we can't just set SLAB_ACCOUNT for radix_tree_node_cachep - that would likely result in a lot of unrelated cgroups using objects from each other's caches. One way to overcome this would be making radix tree preloads per memcg, but that would probably look cumbersome and overcomplicated. Instead, we make radix_tree_node_alloc() first try to allocate from the cache with __GFP_ACCOUNT, no matter if the caller has preloaded or not, and only if it fails fall back on using per cpu preloads. This should make most allocations accounted. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
b6ecd2dea4 |
mm: memcontrol: zap memcg_kmem_online helper
As kmem accounting is now either enabled for all cgroups or disabled system-wide, there's no point in having memcg_kmem_online() helper - instead one can use memcg_kmem_enabled() and mem_cgroup_online(), as shrink_slab() now does. There are only two places left where this helper is used - __memcg_kmem_charge() and memcg_create_kmem_cache(). The former can only be called if memcg_kmem_enabled() returned true. Since the cgroup it operates on is online, mem_cgroup_is_root() check will be enough. memcg_create_kmem_cache() can't use mem_cgroup_online() helper instead of memcg_kmem_online(), because it relies on the fact that in memcg_offline_kmem() memcg->kmem_state is changed before memcg_deactivate_kmem_caches() is called, but there we can just open-code the check. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
0fc9f58a90 |
mm: vmscan: pass root_mem_cgroup instead of NULL to memcg aware shrinker
It's just convenient to implement a memcg aware shrinker when you know that shrink_control->memcg != NULL unless memcg_kmem_enabled() returns false. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
b313aeee25 |
mm: memcontrol: enable kmem accounting for all cgroups in the legacy hierarchy
Workingset code was recently made memcg aware, but shadow node shrinker is still global. As a result, one small cgroup can consume all memory available for shadow nodes, possibly hurting other cgroups by reclaiming their shadow nodes, even though reclaim distances stored in its shadow nodes have no effect. To avoid this, we need to make shadow node shrinker memcg aware. The actual work is done in patch 6 of the series. Patches 1 and 2 prepare memcg/shrinker infrastructure for the change. Patch 3 is just a collateral cleanup. Patch 4 makes radix_tree_node accounted, which is necessary for making shadow node shrinker memcg aware. Patch 5 reduces shadow nodes overhead in case workload mostly uses anonymous pages. This patch: Currently, in the legacy hierarchy kmem accounting is off for all cgroups by default and must be enabled explicitly by writing something to memory.kmem.limit_in_bytes. Since we don't support reclaim on hitting kmem limit, nor do we have any plans to implement it, this is likely to be -1, just to enable kmem accounting and limit kernel memory consumption by the memory.limit_in_bytes along with user memory. This user API was introduced when the implementation of kmem accounting lacked slab shrinker support and hence was useless in practice. Things have changed since then - slab shrinkers were made memcg aware, the accounting overhead seems to be negligible, and a failure to charge a kmem allocation should not have critical consequences, because we only account those kernel objects that should be safe to fail. That's why kmem accounting is enabled by default for all cgroups in the default hierarchy, which will eventually replace the legacy one. The ability to enable kmem accounting for some cgroups while keeping it disabled for others is getting difficult to maintain. E.g. to make shadow node shrinker memcg aware (see mm/workingset.c), we need to know the relationship between the number of shadow nodes allocated for a cgroup and the size of its lru list. If kmem accounting is enabled for all cgroups there is no problem, but what should we do if kmem accounting is enabled only for half of cgroups? We've no other choice but use global lru stats while scanning root cgroup's shadow nodes, but that would be wrong if kmem accounting was enabled for all cgroups (which is the case if the unified hierarchy is used), in which case we should use lru stats of the root cgroup's lruvec. That being said, let's enable kmem accounting for all memory cgroups by default. If one finds it unstable or too costly, it can always be disabled system-wide by passing cgroup.memory=nokmem to the kernel at boot time. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
4b0f326163 |
include/linux/page-flags.h: force inlining of selected page flag modifications
Sometimes gcc mysteriously doesn't inline
very small functions we expect to be inlined. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
With this .config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os,
the following functions get deinlined many times.
Examples of disassembly:
<SetPageUptodate> (43 copies, 141 calls):
55 push %rbp
48 89 e5 mov %rsp,%rbp
f0 80 0f 08 lock orb $0x8,(%rdi)
5d pop %rbp
c3 retq
<PagePrivate> (10 copies, 134 calls):
48 8b 07 mov (%rdi),%rax
55 push %rbp
48 89 e5 mov %rsp,%rbp
48 c1 e8 0b shr $0xb,%rax
83 e0 01 and $0x1,%eax
5d pop %rbp
c3 retq
This patch fixes this via s/inline/__always_inline/.
Code size decrease after the patch is ~7k:
text data bss dec hex filename
92125002 20826048 36417536 149368586 8e72f0a vmlinux
92118087 20826112 36417536 149361735 8e71447 vmlinux7_pageops_after
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
ee91ef6173 |
bufferhead: force inlining of buffer head flag operations
With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously doesn't inline
very small functions we expect to be inlined. See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
With this .config:
http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os,
set_buffer_foo(), clear_buffer_foo() and similar functions get deinlined
about 60 times. Examples of disassembly:
<set_buffer_mapped> (14 copies, 43 calls):
55 push %rbp
48 89 e5 mov %rsp,%rbp
f0 80 0f 20 lock orb $0x20,(%rdi)
5d pop %rbp
c3 retq
<buffer_mapped> (3 copies, 34 calls):
48 8b 07 mov (%rdi),%rax
55 push %rbp
48 89 e5 mov %rsp,%rbp
48 c1 e8 05 shr $0x5,%rax
83 e0 01 and $0x1,%eax
5d pop %rbp
c3 retq
<set_buffer_new> (5 copies, 13 calls):
55 push %rbp
48 89 e5 mov %rsp,%rbp
f0 80 0f 40 lock orb $0x40,(%rdi)
5d pop %rbp
c3 retq
This patch fixes this via s/inline/__always_inline/.
This decreases vmlinux by about 3 kbytes.
text data bss dec hex filename
88200439 19905208 36421632 144527279 89d4faf vmlinux2
88197239 19905240 36421632 144524111 89d434f vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
075db1502f |
tools/vm/page-types.c: add memory cgroup dumping and filtering
This adds two command line keys: -c|--cgroup path|@inode Walk only pages owned by this memory cgroup -C|--list-cgroup Show memory cgroup inodes [vdavydov@virtuozzo.com: opt_cgroup should be uint64_t. Fix conflicts with "tools/vm/page-types.c: support swap entry"] Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
accf62422b |
mm, kswapd: replace kswapd compaction with waking up kcompactd
Similarly to direct reclaim/compaction, kswapd attempts to combine
reclaim and compaction to attempt making memory allocation of given
order available.
The details differ from direct reclaim e.g. in having high watermark as
a goal. The code involved in kswapd's reclaim/compaction decisions has
evolved to be quite complex.
Testing reveals that it doesn't actually work in at least one scenario,
and closer inspection suggests that it could be greatly simplified
without compromising on the goal (make high-order page available) or
efficiency (don't reclaim too much). The simplification relieas of
doing all compaction in kcompactd, which is simply woken up when high
watermarks are reached by kswapd's reclaim.
The scenario where kswapd compaction doesn't work was found with mmtests
test stress-highalloc configured to attempt order-9 allocations without
direct reclaim, just waking up kswapd. There was no compaction attempt
from kswapd during the whole test. Some added instrumentation shows
what happens:
- balance_pgdat() sets end_zone to Normal, as it's not balanced
- reclaim is attempted on DMA zone, which sets nr_attempted to 99, but
it cannot reclaim anything, so sc.nr_reclaimed is 0
- for zones DMA32 and Normal, kswapd_shrink_zone uses testorder=0, so
it merely checks if high watermarks were reached for base pages.
This is true, so no reclaim is attempted. For DMA, testorder=0
wasn't used, as compaction_suitable() returned COMPACT_SKIPPED
- even though the pgdat_needs_compaction flag wasn't set to false, no
compaction happens due to the condition sc.nr_reclaimed >
nr_attempted being false (as 0 < 99)
- priority-- due to nr_reclaimed being 0, repeat until priority reaches
0 pgdat_balanced() is false as only the small zone DMA appears
balanced (curiously in that check, watermark appears OK and
compaction_suitable() returns COMPACT_PARTIAL, because a lower
classzone_idx is used there)
Now, even if it was decided that reclaim shouldn't be attempted on the
DMA zone, the scenario would be the same, as (sc.nr_reclaimed=0 >
nr_attempted=0) is also false. The condition really should use >= as
the comment suggests. Then there is a mismatch in the check for setting
pgdat_needs_compaction to false using low watermark, while the rest uses
high watermark, and who knows what other subtlety. Hopefully this
demonstrates that this is unsustainable.
Luckily we can simplify this a lot. The reclaim/compaction decisions
make sense for direct reclaim scenario, but in kswapd, our primary goal
is to reach high watermark in order-0 pages. Afterwards we can attempt
compaction just once. Unlike direct reclaim, we don't reclaim extra
pages (over the high watermark), the current code already disallows it
for good reasons.
After this patch, we simply wake up kcompactd to process the pgdat,
after we have either succeeded or failed to reach the high watermarks in
kswapd, which goes to sleep. We pass kswapd's order and classzone_idx,
so kcompactd can apply the same criteria to determine which zones are
worth compacting. Note that we use the classzone_idx from
wakeup_kswapd(), not balanced_classzone_idx which can include higher
zones that kswapd tried to balance too, but didn't consider them in
pgdat_balanced().
Since kswapd now cannot create high-order pages itself, we need to
adjust how it determines the zones to be balanced. The key element here
is adding a "highorder" parameter to zone_balanced, which, when set to
false, makes it consider only order-0 watermark instead of the desired
higher order (this was done previously by kswapd_shrink_zone(), but not
elsewhere). This false is passed for example in pgdat_balanced().
Importantly, wakeup_kswapd() uses true to make sure kswapd and thus
kcompactd are woken up for a high-order allocation failure.
The last thing is to decide what to do with pageblock_skip bitmap
handling. Compaction maintains a pageblock_skip bitmap to record
pageblocks where isolation recently failed. This bitmap can be reset by
three ways:
1) direct compaction is restarting after going through the full deferred cycle
2) kswapd goes to sleep, and some other direct compaction has previously
finished scanning the whole zone and set zone->compact_blockskip_flush.
Note that a successful direct compaction clears this flag.
3) compaction was invoked manually via trigger in /proc
The case 2) is somewhat fuzzy to begin with, but after introducing
kcompactd we should update it. The check for direct compaction in 1),
and to set the flush flag in 2) use current_is_kswapd(), which doesn't
work for kcompactd. Thus, this patch adds bool direct_compaction to
compact_control to use in 2). For the case 1) we remove the check
completely - unlike the former kswapd compaction, kcompactd does use the
deferred compaction functionality, so flushing tied to restarting from
deferred compaction makes sense here.
Note that when kswapd goes to sleep, kcompactd is woken up, so it will
see the flushed pageblock_skip bits. This is different from when the
former kswapd compaction observed the bits and I believe it makes more
sense. Kcompactd can afford to be more thorough than a direct
compaction trying to limit allocation latency, or kswapd whose primary
goal is to reclaim.
For testing, I used stress-highalloc configured to do order-9
allocations with GFP_NOWAIT|__GFP_HIGH|__GFP_COMP, so they relied just
on kswapd/kcompactd reclaim/compaction (the interfering kernel builds in
phases 1 and 2 work as usual):
stress-highalloc
4.5-rc1+before 4.5-rc1+after
-nodirect -nodirect
Success 1 Min 1.00 ( 0.00%) 5.00 (-66.67%)
Success 1 Mean 1.40 ( 0.00%) 6.20 (-55.00%)
Success 1 Max 2.00 ( 0.00%) 7.00 (-16.67%)
Success 2 Min 1.00 ( 0.00%) 5.00 (-66.67%)
Success 2 Mean 1.80 ( 0.00%) 6.40 (-52.38%)
Success 2 Max 3.00 ( 0.00%) 7.00 (-16.67%)
Success 3 Min 34.00 ( 0.00%) 62.00 ( 1.59%)
Success 3 Mean 41.80 ( 0.00%) 63.80 ( 1.24%)
Success 3 Max 53.00 ( 0.00%) 65.00 ( 2.99%)
User 3166.67 3181.09
System 1153.37 1158.25
Elapsed 1768.53 1799.37
4.5-rc1+before 4.5-rc1+after
-nodirect -nodirect
Direct pages scanned 32938 32797
Kswapd pages scanned 2183166 2202613
Kswapd pages reclaimed 2152359 2143524
Direct pages reclaimed 32735 32545
Percentage direct scans 1% 1%
THP fault alloc 579 612
THP collapse alloc 304 316
THP splits 0 0
THP fault fallback 793 778
THP collapse fail 11 16
Compaction stalls 1013 1007
Compaction success 92 67
Compaction failures 920 939
Page migrate success 238457 721374
Page migrate failure 23021 23469
Compaction pages isolated 504695 1479924
Compaction migrate scanned 661390 8812554
Compaction free scanned 13476658 84327916
Compaction cost 262 838
After this patch we see improvements in allocation success rate
(especially for phase 3) along with increased compaction activity. The
compaction stalls (direct compaction) in the interfering kernel builds
(probably THP's) also decreased somewhat thanks to kcompactd activity,
yet THP alloc successes improved a bit.
Note that elapsed and user time isn't so useful for this benchmark,
because of the background interference being unpredictable. It's just
to quickly spot some major unexpected differences. System time is
somewhat more useful and that didn't increase.
Also (after adjusting mmtests' ftrace monitor):
Time kswapd awake 2547781 2269241
Time kcompactd awake 0 119253
Time direct compacting 939937 557649
Time kswapd compacting 0 0
Time kcompactd compacting 0 119099
The decrease of overal time spent compacting appears to not match the
increased compaction stats. I suspect the tasks get rescheduled and
since the ftrace monitor doesn't see that, the reported time is wall
time, not CPU time. But arguably direct compactors care about overall
latency anyway, whether busy compacting or waiting for CPU doesn't
matter. And that latency seems to almost halved.
It's also interesting how much time kswapd spent awake just going
through all the priorities and failing to even try compacting, over and
over.
We can also configure stress-highalloc to perform both direct
reclaim/compaction and wakeup kswapd/kcompactd, by using
GFP_KERNEL|__GFP_HIGH|__GFP_COMP:
stress-highalloc
4.5-rc1+before 4.5-rc1+after
-direct -direct
Success 1 Min 4.00 ( 0.00%) 9.00 (-50.00%)
Success 1 Mean 8.00 ( 0.00%) 10.00 (-19.05%)
Success 1 Max 12.00 ( 0.00%) 11.00 ( 15.38%)
Success 2 Min 4.00 ( 0.00%) 9.00 (-50.00%)
Success 2 Mean 8.20 ( 0.00%) 10.00 (-16.28%)
Success 2 Max 13.00 ( 0.00%) 11.00 ( 8.33%)
Success 3 Min 75.00 ( 0.00%) 74.00 ( 1.33%)
Success 3 Mean 75.60 ( 0.00%) 75.20 ( 0.53%)
Success 3 Max 77.00 ( 0.00%) 76.00 ( 0.00%)
User 3344.73 3246.04
System 1194.24 1172.29
Elapsed 1838.04 1836.76
4.5-rc1+before 4.5-rc1+after
-direct -direct
Direct pages scanned 125146 120966
Kswapd pages scanned 2119757 2135012
Kswapd pages reclaimed 2073183 2108388
Direct pages reclaimed 124909 120577
Percentage direct scans 5% 5%
THP fault alloc 599 652
THP collapse alloc 323 354
THP splits 0 0
THP fault fallback 806 793
THP collapse fail 17 16
Compaction stalls 2457 2025
Compaction success 906 518
Compaction failures 1551 1507
Page migrate success 2031423 2360608
Page migrate failure 32845 40852
Compaction pages isolated
|
||
|
|
e888ca3545 |
mm, memory hotplug: small cleanup in online_pages()
We can reuse the nid we've determined instead of repeated pfn_to_nid() usages. Also zone_to_nid() should be a bit cheaper in general than pfn_to_nid(). Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
698b1b3064 |
mm, compaction: introduce kcompactd
Memory compaction can be currently performed in several contexts: - kswapd balancing a zone after a high-order allocation failure - direct compaction to satisfy a high-order allocation, including THP page fault attemps - khugepaged trying to collapse a hugepage - manually from /proc The purpose of compaction is two-fold. The obvious purpose is to satisfy a (pending or future) high-order allocation, and is easy to evaluate. The other purpose is to keep overal memory fragmentation low and help the anti-fragmentation mechanism. The success wrt the latter purpose is more The current situation wrt the purposes has a few drawbacks: - compaction is invoked only when a high-order page or hugepage is not available (or manually). This might be too late for the purposes of keeping memory fragmentation low. - direct compaction increases latency of allocations. Again, it would be better if compaction was performed asynchronously to keep fragmentation low, before the allocation itself comes. - (a special case of the previous) the cost of compaction during THP page faults can easily offset the benefits of THP. - kswapd compaction appears to be complex, fragile and not working in some scenarios. It could also end up compacting for a high-order allocation request when it should be reclaiming memory for a later order-0 request. To improve the situation, we should be able to benefit from an equivalent of kswapd, but for compaction - i.e. a background thread which responds to fragmentation and the need for high-order allocations (including hugepages) somewhat proactively. One possibility is to extend the responsibilities of kswapd, which could however complicate its design too much. It should be better to let kswapd handle reclaim, as order-0 allocations are often more critical than high-order ones. Another possibility is to extend khugepaged, but this kthread is a single instance and tied to THP configs. This patch goes with the option of a new set of per-node kthreads called kcompactd, and lays the foundations, without introducing any new tunables. The lifecycle mimics kswapd kthreads, including the memory hotplug hooks. For compaction, kcompactd uses the standard compaction_suitable() and ompact_finished() criteria and the deferred compaction functionality. Unlike direct compaction, it uses only sync compaction, as there's no allocation latency to minimize. This patch doesn't yet add a call to wakeup_kcompactd. The kswapd compact/reclaim loop for high-order pages will be replaced by waking up kcompactd in the next patch with the description of what's wrong with the old approach. Waking up of the kcompactd threads is also tied to kswapd activity and follows these rules: - we don't want to affect any fastpaths, so wake up kcompactd only from the slowpath, as it's done for kswapd - if kswapd is doing reclaim, it's more important than compaction, so don't invoke kcompactd until kswapd goes to sleep - the target order used for kswapd is passed to kcompactd Future possible future uses for kcompactd include the ability to wake up kcompactd on demand in special situations, such as when hugepages are not available (currently not done due to __GFP_NO_KSWAPD) or when a fragmentation event (i.e. __rmqueue_fallback()) occurs. It's also possible to perform periodic compaction with kcompactd. [arnd@arndb.de: fix build errors with kcompactd] [paul.gortmaker@windriver.com: don't use modular references for non modular code] Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: David Rientjes <rientjes@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
81c5857b27 |
mm, kswapd: remove bogus check of balance_classzone_idx
During work on kcompactd integration I have spotted a confusing check of balance_classzone_idx, which I believe is bogus. The balanced_classzone_idx is filled by balance_pgdat() as the highest zone it attempted to balance. This was introduced by commit |
||
|
|
21c647865a |
tile: query dynamic DEBUG_PAGEALLOC setting
We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Takashi Iwai <tiwai@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
e7df0d88c4 |
powerpc: query dynamic DEBUG_PAGEALLOC setting
We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
505f6d22db |
sound: query dynamic DEBUG_PAGEALLOC setting
We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. [akpm@linux-foundation.org: export _debug_pagealloc_enabled to modules] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Takashi Iwai <tiwai@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
922d566cdc |
mm/slub: query dynamic DEBUG_PAGEALLOC setting
We can disable debug_pagealloc processing even if the code is compiled with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. [akpm@linux-foundation.org: clean up code, per Christian] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Takashi Iwai <tiwai@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
f48d97f340 |
mm/vmalloc: query dynamic DEBUG_PAGEALLOC setting
As CONFIG_DEBUG_PAGEALLOC can be enabled/disabled via kernel parameters we can optimize some cases by checking the enablement state. This is follow-up work for Christian's Optimize CONFIG_DEBUG_PAGEALLOC: https://lkml.org/lkml/2016/1/27/194 Remaining work is to make sparc to be aware of this but it looks not easy for me so I skip that in this series. This patch (of 5): We can disable debug_pagealloc processing even if the code is complied with CONFIG_DEBUG_PAGEALLOC. This patch changes the code to query whether it is enabled or not in runtime. [akpm@linux-foundation.org: update comment, per David. Adjust comment to use 80 cols] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Takashi Iwai <tiwai@suse.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
0335ddd34f |
tools/vm/page-types.c: support swap entry
/proc/pid/pagemap (pte_to_pagemap_entry() internally) already reports about swap entry, so let's make the in-kernel utility aware of it. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
0a71649cb7 |
/proc/kpageflags: return KPF_SLAB for slab tail pages
Currently /proc/kpageflags returns just KPF_COMPOUND_TAIL for slab tail
pages, which is inconvenient when grasping how slab pages are
distributed (userspace always needs to check which kind of tail pages by
itself). This patch sets KPF_SLAB for such pages.
With this patch:
$ grep Slab /proc/meminfo ; tools/vm/page-types -b slab
Slab: 64880 kB
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000080 16220 63 _______S__________________________________ slab
total 16220 63
16220 pages equals to 64880 kB, so returned result is consistent with the
global counter.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
832fc1de01 |
/proc/kpageflags: return KPF_BUDDY for "tail" buddy pages
Currently /proc/kpageflags returns nothing for "tail" buddy pages, which
is inconvenient when grasping how free pages are distributed. This
patch sets KPF_BUDDY for such pages.
With this patch:
$ grep MemFree /proc/meminfo ; tools/vm/page-types -b buddy
MemFree: 3134992 kB
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000400 779272 3044 __________B_______________________________ buddy
0x0000000000000c00 4385 17 __________BM______________________________ buddy,mmap
total 783657 3061
783657 pages is 3134628 kB (roughly consistent with the global counter,)
so it's OK.
[akpm@linux-foundation.org: update comment, per Naoya]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com>>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
12580e4b54 |
mm: memcontrol: report kernel stack usage in cgroup2 memory.stat
Show how much memory is allocated to kernel stacks. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
27ee57c93f |
mm: memcontrol: report slab usage in cgroup2 memory.stat
Show how much memory is used for storing reclaimable and unreclaimable in-kernel data structures allocated from slab caches. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
72b54e7314 |
mm: memcontrol: make tree_{stat,events} fetch all stats
Currently, tree_{stat,events} helpers can only get one stat index at a
time, so when there are a lot of stats to be reported one has to call it
over and over again (see memory_stat_show). This is neither effective,
nor does it look good. Instead, let's make these helpers take a
snapshot of all available counters.
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
fcff7d7eeb |
mm: memcontrol: do not bypass slab charge if memcg is offline
Slab pages are charged in two steps. First, an appropriate per memcg cache is selected (see memcg_kmem_get_cache) basing on the current context, then the new slab page is charged to the memory cgroup which the selected cache was created for (see memcg_charge_slab -> __memcg_kmem_charge_memcg). It is OK to bypass kmemcg charge at step 1, but if step 1 succeeded and we successfully allocated a new slab page, step 2 must be performed, otherwise we would get a per memcg kmem cache which contains a slab that does not hold a reference to the memory cgroup owning the cache. Since per memcg kmem caches are destroyed on memcg css free, this could result in freeing a cache while there are still active objects in it. However, currently we will bypass slab page charge if the memory cgroup owning the cache is offline (see __memcg_kmem_charge_memcg). This is very unlikely to occur in practice, because for this to happen a process must be migrated to a different cgroup and the old cgroup must be removed while the process is in kmalloc somewhere between steps 1 and 2 (e.g. trying to allocate a new page). Nevertheless, it's still better to eliminate such a possibility. Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
6a618957ad |
mm: oom_kill: don't ignore oom score on exiting tasks
When the OOM killer scans tasks and encounters a PF_EXITING one, it force-selects that task regardless of the score. The problem is that if that task got stuck waiting for some state the allocation site is holding, the OOM reaper can not move on to the next best victim. Frankly, I don't even know why we check for exiting tasks in the OOM killer. We've tried direct reclaim at least 15 times by the time we decide the system is OOM, there was plenty of time to exit and free memory; and a task might exit voluntarily right after we issue a kill. This is testing pure noise. Remove it. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Argangeli <andrea@kernel.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
a1ee1932aa |
watchdog: don't run proc_watchdog_update if new value is same as old
While working on a script to restore all sysctl params before a series of
tests I found that writing any value into the
/proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh}
causes them to call proc_watchdog_update().
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
There doesn't appear to be a reason for doing this work every time a write
occurs, so only do it when the values change.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Acked-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: <stable@vger.kernel.org> [4.1.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
4c11e554fb |
drivers/firmware/broadcom/bcm47xx_nvram.c: fix incorrect __ioread32_copy
Commit |
||
|
|
b0f84ac352 |
ia64: define ioremap_uc()
All architectures now need ioremap_uc(), ia64 seems defines this already through its ioremap_nocache() and it already ensures it *only* uses UC. This is needed since v4.3 to complete an allyesconfig compile on ia64, there were others archs that needed this, and this one seems to have fallen through the cracks. Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org> Reported-by: kbuild test robot <fengguang.wu@intel.com> Acked-by: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> [4.3+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
09fd671ccb |
Merge tag 'fbdev-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev updates from Tomi Valkeinen: - Miscallaneous small fixes to various fbdev drivers - Remove fb_rotate, which was never used - pmag fb improvements * tag 'fbdev-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (21 commits) xen kconfig: don't "select INPUT_XEN_KBDDEV_FRONTEND" video: fbdev: sis: remove unused variable drivers/video: make fbdev/sunxvr2500.c explicitly non-modular drivers/video: make fbdev/sunxvr1000.c explicitly non-modular drivers/video: make fbdev/sunxvr500.c explicitly non-modular video: exynos: fix modular build fbdev: da8xx-fb: fix videomodes of lcd panels fbdev: kill fb_rotate video: fbdev: bt431: Correct cursor format control macro video: fbdev: pmag-ba-fb: Optimize Bt455 colormap addressing video: fbdev: pmag-ba-fb: Fix and rework Bt455 colormap handling video: fbdev: bt455: Remove unneeded colormap helpers for cursor support video: fbdev: pmag-aa-fb: Report video timings video: fbdev: pmag-aa-fb: Enable building as a module video: fbdev: pmag-aa-fb: Adapt to current APIs video: fbdev: pmag-ba-fb: Fix the lower margin size fbdev: sh_mobile_lcdc: Use ARCH_RENESAS fbdev: n411: check return value fbdev: exynos: fix IS_ERR_VALUE usage video: Use bool instead int pointer for get_opt_bool() argument ... |
||
|
|
bace3db5da |
Merge tag 'media/v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab: - Added support for some new video formats - mn88473 DVB frontend driver got promoted from staging - several improvements at the VSP1 driver - several cleanups and improvements at the Media Controller - added Media Controller support to snd-usb-audio. Currently, enabled only for au0828-based V4L2/DVB boards - Several improvements at nuvoton-cir: it now supports wake up codes - Add media controller support to em28xx and saa7134 drivers - coda driver now accepts NXP distributed firmware files - Some legacy SoC camera drivers will be moving to staging, as they're outdated and nobody so far is willing to fix and convert them to use the current media framework - As usual, lots of cleanups, improvements and new board additions. * tag 'media/v4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (381 commits) media: au0828 disable tuner to demod link in au0828_media_device_register() [media] touptek: cast char types on %x printk [media] touptek: don't DMA at the stack [media] mceusb: use %*ph for small buffer dumps [media] v4l: exynos4-is: Drop unneeded check when setting up fimc-lite links [media] v4l: vsp1: Check if an entity is a subdev with the right function [media] hide unused functions for !MEDIA_CONTROLLER [media] em28xx: fix Terratec Grabby AC97 codec detection [media] media: add prefixes to interface types [media] media: rc: nuvoton: switch attribute wakeup_data to text [media] v4l2-ioctl: fix YUV422P pixel format description [media] media: fix null pointer dereference in v4l_vb2q_enable_media_source() [media] v4l2-mc.h: fix yet more compiler errors [media] staging/media: add missing TODO files [media] media.h: always start with 1 for the audio entities [media] sound/usb: Use meaninful names for goto labels [media] v4l2-mc.h: fix compiler warnings [media] media: au0828 audio mixer isn't connected to decoder [media] sound/usb: Use Media Controller API to share media resources [media] dw2102: add support for TeVii S662 ... |
||
|
|
8759957b77 |
Merge tag 'libnvdimm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
- Asynchronous address range scrub:
Given the capacities of next generation persistent memory devices a
scrub operation to find all poison may take 10s of seconds. We
want this scrub work to be done asynchronously with the rest of
system initialization, so we move it out of line from the NFIT
probing, i.e. acpi_nfit_add().
- Clear poison:
ACPI 6.1 introduces the ability to send "clear error" commands to
the ACPI0012:00 device representing the root of an "nvdimm bus".
Similar to relocating a bad block on a disk, this support clears
media errors in response to a write.
- Persistent memory resource tracking:
A persistent memory range may be designated as simply "reserved" by
platform firmware in the efi/e820 memory map. Later when the NFIT
driver loads it discovers that the range is "Persistent Memory".
The NFIT bus driver inserts a resource to advertise that
"persistent" attribute in the system resource tree for /proc/iomem
and kernel-internal usages.
- Miscellaneous cleanups and fixes:
Workaround section misaligned pmem ranges when allocating a struct
page memmap, fix handling of the read-only case in the ioctl path,
and clean up block device major number allocation.
* tag 'libnvdimm-for-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (26 commits)
libnvdimm, pmem: clear poison on write
libnvdimm, pmem: fix kmap_atomic() leak in error path
nvdimm/btt: don't allocate unused major device number
nvdimm/blk: don't allocate unused major device number
pmem: don't allocate unused major device number
ACPI: Change NFIT driver to insert new resource
resource: Export insert_resource and remove_resource
resource: Add remove_resource interface
resource: Change __request_region to inherit from immediate parent
libnvdimm, pmem: fix ia64 build, use PHYS_PFN
nfit, libnvdimm: clear poison command support
libnvdimm, pfn: 'resource'-address and 'size' attributes for pfn devices
libnvdimm, pmem: adjust for section collisions with 'System RAM'
libnvdimm, pmem: fix 'pfn' support for section-misaligned namespaces
libnvdimm: Fix security issue with DSM IOCTL.
libnvdimm: Clean-up access mode check.
tools/testing/nvdimm: expand ars unit testing
nfit: disable userspace initiated ars during scrub
nfit: scrub and register regions in a workqueue
nfit, libnvdimm: async region scrub workqueue
...
|
||
|
|
6968e6f832 |
Merge tag 'dm-4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Most attention this cycle went to optimizing blk-mq request-based DM
(dm-mq) that is used exclussively by DM multipath:
- A stable fix for dm-mq that eliminates excessive context
switching offers the biggest performance improvement (for both
IOPs and throughput).
- But more work is needed, during the next cycle, to reduce
spinlock contention in DM multipath on large NUMA systems.
- A stable fix for a NULL pointer seen when DM stats is enabled on a DM
multipath device that must requeue an IO due to path failure.
- A stable fix for DM snapshot to disallow the COW and origin devices
from being identical. This amounts to graceful failure in the face
of userspace error because these devices shouldn't ever be identical.
- Stable fixes for DM cache and DM thin provisioning to address crashes
seen if/when their respective metadata device experiences failures
that cause the transition to 'fail_io' mode.
- The DM cache 'mq' policy is now an alias for the 'smq' policy. The
'smq' policy proved to be consistently better than 'mq'. As such
'mq', with all its complex user-facing tunables, has been eliminated.
- Improve DM thin provisioning to consistently return -ENOSPC once the
thin-pool's data volume is out of space.
- Improve DM core to properly handle error propagation if
bio_integrity_clone() fails in clone_bio().
- Other small cleanups and improvements to DM core.
* tag 'dm-4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (41 commits)
dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request()
dm thin: consistently return -ENOSPC if pool has run out of data space
dm cache: bump the target version
dm cache: make sure every metadata function checks fail_io
dm: add missing newline between DM_DEBUG_BLOCK_STACK_TRACING and DM_BUFIO
dm cache policy smq: clarify that mq registration failure was for 'mq'
dm: return error if bio_integrity_clone() fails in clone_bio()
dm thin metadata: don't issue prefetches if a transaction abort has failed
dm snapshot: disallow the COW and origin devices from being identical
dm cache: make the 'mq' policy an alias for 'smq'
dm: drop unnecessary assignment of md->queue
dm: reorder 'struct mapped_device' members to fix alignment and holes
dm: remove dummy definition of 'struct dm_table'
dm: add 'dm_numa_node' module parameter
dm thin metadata: remove needless newline from subtree_dec() DMERR message
dm mpath: cleanup reinstate_path() et al based on code review
dm mpath: remove __pgpath_busy forward declaration, rename to pgpath_busy
dm mpath: switch from 'unsigned' to 'bool' for flags where appropriate
dm round robin: use percpu 'repeat_count' and 'current_path'
dm path selector: remove 'repeat_count' return from .select_path hook
...
|
||
|
|
cae8da047b |
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley: "This pull includes driver updates from the usual suspects (stex, hpsa, ncr5380, scsi_dh, qla2xxx, be2iscsi, hisi_sas, cxlflash, aacraid, mp3sas, megaraid_sas, ibmvscsi, ufs) plus an assortment of miscellaneous fixes. The major user visible change of this pull is that we've moved from monotonically increasing host number to an ida allocated one (meaning the numbers get re-used) because someone managed to wrap the count in an iscsi system. We don't believe there will be any adverse consequences of this" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (230 commits) MAINTAINERS: use new email address for James Bottomley mpt3sas: Remove unnecessary synchronize_irq() before free_irq() sg: fix dxferp in from_to case cxlflash: Increase cmd_per_lun for better throughput cxlflash: Fix to avoid unnecessary scan with internal LUNs cxlflash: Reorder user context initialization cxlflash: Simplify attach path error cleanup cxlflash: Split out context initialization cxlflash: Unmap problem state area before detaching master context cxlflash: Simplify PCI registration scsi: storvsc: fix SRB_STATUS_ABORTED handling be2iscsi: set the boot_kset pointer to NULL in case of failure sd: Fix discard granularity when LBPRZ=1 be2iscsi: Remove unnecessary synchronize_irq() before free_irq() scsi_sysfs: call 'device_add' after attaching device handler scsi_dh_emc: update 'access_state' field scsi_dh_rdac: update 'access_state' field scsi_dh_alua: update 'access_state' field scsi_dh_alua: use common definitions for ALUA state scsi: Add 'access_state' and 'preferred_path' attribute ... |
||
|
|
7bb7a74886 |
Merge branch 'stable/for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft
Pull iscsi_ibft update from Konrad Rzeszutek Wilk: "A simple patch that had been rattling around in SuSE repo" * 'stable/for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft: iscsi_ibft: Add prefix-len attr and display netmask |
||
|
|
63e30271b0 |
Merge tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"PCI changes for v4.6:
Enumeration:
- Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
- Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas
Resource management:
- Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
- Don't assign or reassign immutable resources (Bjorn Helgaas)
- Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
- Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
- Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
- ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
- ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
- MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
- Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
- Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
- rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
- designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
Virtualization:
- Wait for up to 1000ms after FLR reset (Alex Williamson)
- Support SR-IOV on any function type (Kelly Zytaruk)
- Add ACS quirk for all Cavium devices (Manish Jaggi)
AER:
- Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
- Restore pci_ops pointer while calling original pci_ops (David Daney)
- Fix aer_inject error codes (Jean Delvare)
- Use dev_warn() in aer_inject (Jean Delvare)
- Log actual error causes in aer_inject (Jean Delvare)
- Log aer_inject error injections (Jean Delvare)
VPD:
- Prevent VPD access for buggy devices (Babu Moger)
- Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
- Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
- Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
- Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
- Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
- Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
- Update VPD definitions (Hannes Reinecke)
- Allow access to VPD attributes with size 0 (Hannes Reinecke)
- Determine actual VPD size on first access (Hannes Reinecke)
Generic host bridge driver:
- Move structure definitions to separate header file (David Daney)
- Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
- Expose pci_host_common_probe() for use by other drivers (David Daney)
Altera host bridge driver:
- Fix altera_pcie_link_is_up() (Ley Foon Tan)
Cavium ThunderX host bridge driver:
- Add PCIe host driver for ThunderX processors (David Daney)
- Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)
Freescale i.MX6 host bridge driver:
- Add DT bindings to configure PHY Tx driver settings (Justin Waters)
- Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
- Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
- Remove broken Gen2 workaround (Lucas Stach)
- Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)
Freescale Layerscape host bridge driver:
- Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)
Intel VMD host bridge driver:
- Attach VMD resources to parent domain's resource tree (Jon Derrick)
- Set bus resource start to 0 (Keith Busch)
Microsoft Hyper-V host bridge driver:
- Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
- Look up IRQ domain by fwnode_handle (Jake Oshins)
- Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)
NVIDIA Tegra host bridge driver:
- Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
- Implement ->{add,remove}_bus() callbacks (Thierry Reding)
- Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
- Track bus -> CPU mapping (Thierry Reding)
- Remove misleading PHYS_OFFSET (Thierry Reding)
Renesas R-Car host bridge driver:
- Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)
Synopsys DesignWare host bridge driver:
- ARC: Add PCI support (Joao Pinto)
- Add generic dw_pcie_wait_for_link() (Joao Pinto)
- Add default link up check if sub-driver doesn't override (Joao Pinto)
- Add driver for prototyping kits based on ARC SDP (Joao Pinto)
TI Keystone host bridge driver:
- Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)
Xilinx AXI host bridge driver:
- Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
- Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
- Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
- Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
- microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)
Xilinx NWL host bridge driver:
- Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)
Miscellaneous:
- Check device_attach() return value always (Bjorn Helgaas)
- Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
- Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
- ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
- Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
- Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
- Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
- unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
- Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
- Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
- Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
- frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
- Move pci_dma_* helpers to common code (Christoph Hellwig)
- Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
- Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
- Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)"
* tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
PCI: designware: Add driver for prototyping kits based on ARC SDP
PCI: designware: Add default link up check if sub-driver doesn't override
PCI: designware: Add generic dw_pcie_wait_for_link()
PCI: Cleanup pci/pcie/Kconfig whitespace
PCI: Simplify pci_create_attr() control flow
PCI: Don't leak memory if sysfs_create_bin_file() fails
PCI: Simplify sysfs ROM cleanup
PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
ia64/PCI: Use ioremap() instead of open-coded equivalent
ia64/PCI: Use temporary struct resource * to avoid repetition
PCI: Clean up pci_map_rom() whitespace
PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices
PCI: thunder: Add PCIe host driver for ThunderX processors
PCI: generic: Expose pci_host_common_probe() for use by other drivers
PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe()
...
|
||
|
|
277edbabf6 |
Merge tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"This time the majority of changes go into cpufreq and they are
significant.
First off, the way CPU frequency updates are triggered is different
now. Instead of having to set up and manage a deferrable timer for
each CPU in the system to evaluate and possibly change its frequency
periodically, cpufreq governors set up callbacks to be invoked by the
scheduler on a regular basis (basically on utilization updates). The
"old" governors, "ondemand" and "conservative", still do all of their
work in process context (although that is triggered by the scheduler
now), but intel_pstate does it all in the callback invoked by the
scheduler with no need for any additional asynchronous processing.
Of course, this eliminates the overhead related to the management of
all those timers, but also it allows the cpufreq governor code to be
simplified quite a bit. On top of that, the common code and data
structures used by the "ondemand" and "conservative" governors are
cleaned up and made more straightforward and some long-standing and
quite annoying problems are addressed. In particular, the handling of
governor sysfs attributes is modified and the related locking becomes
more fine grained which allows some concurrency problems to be avoided
(particularly deadlocks with the core cpufreq code).
In principle, the new mechanism for triggering frequency updates
allows utilization information to be passed from the scheduler to
cpufreq. Although the current code doesn't make use of it, in the
works is a new cpufreq governor that will make decisions based on the
scheduler's utilization data. That should allow the scheduler and
cpufreq to work more closely together in the long run.
In addition to the core and governor changes, cpufreq drivers are
updated too. Fixes and optimizations go into intel_pstate, the
cpufreq-dt driver is updated on top of some modification in the
Operating Performance Points (OPP) framework and there are fixes and
other updates in the powernv cpufreq driver.
Apart from the cpufreq updates there is some new ACPICA material,
including a fix for a problem introduced by previous ACPICA updates,
and some less significant changes in the ACPI code, like CPPC code
optimizations, ACPI processor driver cleanups and support for loading
ACPI tables from initrd.
Also updated are the generic power domains framework, the Intel RAPL
power capping driver and the turbostat utility and we have a bunch of
traditional assorted fixes and cleanups.
Specifics:
- Redesign of cpufreq governors and the intel_pstate driver to make
them use callbacks invoked by the scheduler to trigger CPU
frequency evaluation instead of using per-CPU deferrable timers for
that purpose (Rafael Wysocki).
- Reorganization and cleanup of cpufreq governor code to make it more
straightforward and fix some concurrency problems in it (Rafael
Wysocki, Viresh Kumar).
- Cleanup and improvements of locking in the cpufreq core (Viresh
Kumar).
- Assorted cleanups in the cpufreq core (Rafael Wysocki, Viresh
Kumar, Eric Biggers).
- intel_pstate driver updates including fixes, optimizations and a
modification to make it enable enable hardware-coordinated P-state
selection (HWP) by default if supported by the processor (Philippe
Longepe, Srinivas Pandruvada, Rafael Wysocki, Viresh Kumar, Felipe
Franciosi).
- Operating Performance Points (OPP) framework updates to improve its
handling of voltage regulators and device clocks and updates of the
cpufreq-dt driver on top of that (Viresh Kumar, Jon Hunter).
- Updates of the powernv cpufreq driver to fix initialization and
cleanup problems in it and correct its worker thread handling with
respect to CPU offline, new powernv_throttle tracepoint (Shilpasri
Bhat).
- ACPI cpufreq driver optimization and cleanup (Rafael Wysocki).
- ACPICA updates including one fix for a regression introduced by
previos changes in the ACPICA code (Bob Moore, Lv Zheng, David Box,
Colin Ian King).
- Support for installing ACPI tables from initrd (Lv Zheng).
- Optimizations of the ACPI CPPC code (Prashanth Prakash, Ashwin
Chaugule).
- Support for _HID(ACPI0010) devices (ACPI processor containers) and
ACPI processor driver cleanups (Sudeep Holla).
- Support for ACPI-based enumeration of the AMBA bus (Graeme Gregory,
Aleksey Makarov).
- Modification of the ACPI PCI IRQ management code to make it treat
255 in the Interrupt Line register as "not connected" on x86 (as
per the specification) and avoid attempts to use that value as a
valid interrupt vector (Chen Fan).
- ACPI APEI fixes related to resource leaks (Josh Hunt).
- Removal of modularity from a few ACPI drivers (BGRT, GHES,
intel_pmic_crc) that cannot be built as modules in practice (Paul
Gortmaker).
- PNP framework update to make it treat ACPI_RESOURCE_TYPE_SERIAL_BUS
as a valid resource type (Harb Abdulhamid).
- New device ID (future AMD I2C controller) in the ACPI driver for
AMD SoCs (APD) and in the designware I2C driver (Xiangliang Yu).
- Assorted ACPI cleanups (Colin Ian King, Kaiyen Chang, Oleg Drokin).
- cpuidle menu governor optimization to avoid a square root
computation in it (Rasmus Villemoes).
- Fix for potential use-after-free in the generic device properties
framework (Heikki Krogerus).
- Updates of the generic power domains (genpd) framework including
support for multiple power states of a domain, fixes and debugfs
output improvements (Axel Haslam, Jon Hunter, Laurent Pinchart,
Geert Uytterhoeven).
- Intel RAPL power capping driver updates to reduce IPI overhead in
it (Jacob Pan).
- System suspend/hibernation code cleanups (Eric Biggers, Saurabh
Sengar).
- Year 2038 fix for the process freezer (Abhilash Jindal).
- turbostat utility updates including new features (decoding of more
registers and CPUID fields, sub-second intervals support, GFX MHz
and RC6 printout, --out command line option), fixes (syscall jitter
detection and workaround, reductioin of the number of syscalls
made, fixes related to Xeon x200 processors, compiler warning
fixes) and cleanups (Len Brown, Hubert Chrzaniuk, Chen Yu)"
* tag 'pm+acpi-4.6-rc1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (182 commits)
tools/power turbostat: bugfix: TDP MSRs print bits fixing
tools/power turbostat: correct output for MSR_NHM_SNB_PKG_CST_CFG_CTL dump
tools/power turbostat: call __cpuid() instead of __get_cpuid()
tools/power turbostat: indicate SMX and SGX support
tools/power turbostat: detect and work around syscall jitter
tools/power turbostat: show GFX%rc6
tools/power turbostat: show GFXMHz
tools/power turbostat: show IRQs per CPU
tools/power turbostat: make fewer systems calls
tools/power turbostat: fix compiler warnings
tools/power turbostat: add --out option for saving output in a file
tools/power turbostat: re-name "%Busy" field to "Busy%"
tools/power turbostat: Intel Xeon x200: fix turbo-ratio decoding
tools/power turbostat: Intel Xeon x200: fix erroneous bclk value
tools/power turbostat: allow sub-sec intervals
ACPI / APEI: ERST: Fixed leaked resources in erst_init
ACPI / APEI: Fix leaked resources
intel_pstate: Do not skip samples partially
intel_pstate: Remove freq calculation from intel_pstate_calc_busy()
intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()
...
|
||
|
|
271ecc5253 |
Merge branch 'akpm' (patches from Andrew)
Merge first patch-bomb from Andrew Morton: - some misc things - ofs2 updates - about half of MM - checkpatch updates - autofs4 update * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (120 commits) autofs4: fix string.h include in auto_dev-ioctl.h autofs4: use pr_xxx() macros directly for logging autofs4: change log print macros to not insert newline autofs4: make autofs log prints consistent autofs4: fix some white space errors autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked() autofs4: fix coding style line length in autofs4_wait() autofs4: fix coding style problem in autofs4_get_set_timeout() autofs4: coding style fixes autofs: show pipe inode in mount options kallsyms: add support for relative offsets in kallsyms address table kallsyms: don't overload absolute symbol type for percpu symbols x86: kallsyms: disable absolute percpu symbols on !SMP checkpatch: fix another left brace warning checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses checkpatch: warn on bare unsigned or signed declarations without int checkpatch: exclude asm volatile from complex macro check mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate() mm: migrate: consolidate mem_cgroup_migrate() calls mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous ... |
||
|
|
aa6865d836 |
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Fix misspellings in comments. m68k: Use conventional function parameters for do_sigreturn zorro: Use kobj_to_dev() |