David Hildenbrand
7cf603d17d
kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
...
IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is
always set to 0 by hardware. This is far from beautiful (and confusing),
and the bit only applies to SYSRAM. So let's move it out of the
bus-specific (PnP) defined bits.
We'll add another SYSRAM specific bit soon. If we ever need more bits for
other purposes, we can steal some from "desc", or reshuffle/regroup what
we have.
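A hedged sketch of the outcome (names and bit values as in include/linux/ioport.h after this change, but treat them as illustrative here):

    /* Sketch: SYSRAM-specific modifiers live outside the bus-specific bits. */
    #define IORESOURCE_SYSRAM			0x01000000	/* System RAM (modifier) */
    #define IORESOURCE_SYSRAM_DRIVER_MANAGED	0x02000000	/* always added via a driver */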
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Wei Liu <wei.liu@kernel.org>
Link: https://lkml.kernel.org/r/20200911103459.10306-3-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:18 -07:00
David Hildenbrand
ec62d04e3f
kernel/resource: make release_mem_region_adjustable() never fail
...
Patch series "selective merging of system ram resources", v4.
Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.
This can quickly result in a lot of memory resources, where the actual
resource boundaries are not of interest (they might be relevant for, e.g.,
DIMMs exposed via /proc/iomem to user space). We really want to merge
added resources in this scenario where possible.
Resources are effectively stored in a list-based tree. Having a lot of
resources not only wastes memory, it also makes traversing that tree more
expensive, and makes /proc/iomem explode in size (e.g., requiring
kexec-tools to manually merge resources when creating a kdump header. The
current kexec-tools resource count limit does not allow for more than
~100GB of memory with a memory block size of 128MB on x86-64).
Let's allow selectively merging System RAM resources by specifying a new
flag for add_memory*(). Patch #5 contains a /proc/iomem example. Only
tested with virtio-mem.
This patch (of 8):
Let's make sure splitting a resource on memory hotunplug will never fail.
This will become more relevant once we merge selected System RAM resources
- then, we'll trigger that case more often on memory hotunplug.
In general, this function is already unlikely to fail. When we remove
memory, we free up quite a lot of metadata (memmap, page tables, memory
block device, etc.). The only reason it could really fail would be when
injecting allocation errors.
All other error cases inside release_mem_region_adjustable() seem to be
sanity checks in case the function is abused in a different context - let's
add WARN_ON_ONCE() in these cases so we can catch them.
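A hedged sketch of the resulting shape (heavily simplified; per the description, the real function walks the resource tree and the allocation needed for a split is no longer a failure point):

    /* Sketch only: the return type becomes void, and the former error
     * paths turn into one-shot warnings about interface abuse. */
    void release_mem_region_adjustable(resource_size_t start,
    				       resource_size_t size)
    {
    	resource_size_t end = start + size - 1;

    	if (WARN_ON_ONCE(!size || end < start))
    		return;
    	/* ... adjust or split the resource; cannot fail from here on ... */
    }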
[natechancellor@gmail.com: fix use of ternary condition in release_mem_region_adjustable]
Link: https://lkml.kernel.org/r/20200922060748.2452056-1-natechancellor@gmail.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1159
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Wei Liu <wei.liu@kernel.org>
Link: https://lkml.kernel.org/r/20200911103459.10306-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
b30c59279d
mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory
...
Currently, it can happen that pages are allocated (and freed) via the
buddy before we finished basic memory onlining.
For example, pages are exposed to the buddy and can be allocated before we
actually mark the sections online. Allocated pages could suddenly fail
pfn_to_online_page() checks. We had similar issues with pcp handling,
when pages are allocated+freed before we reach zone_pcp_update() in
online_pages() [1].
Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are
impossible. Once done with the heavy lifting, use
undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE
freelist, marking them ready for allocation. Similar to offline_pages(),
we have to manually adjust zone->nr_isolate_pageblock.
[1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org
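A hedged sketch of the reworked onlining order (helper names as in this series; simplified):

    /* Sketch of online_pages() after this change: */
    move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
    zone->nr_isolate_pageblock += nr_pages / pageblock_nr_pages;
    /* ... heavy lifting: expose pages to the buddy, mark sections online ... */
    undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
    /* only now are the new pages allocatable */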
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-11-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
d882c0067d
mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone()
...
On the memory onlining path, we want to start with MIGRATE_ISOLATE, to
un-isolate the pages after memory onlining is complete. Let's allow
passing in the migratetype.
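The interface change, sketched (argument names illustrative):

    void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
    			    unsigned long nr_pages,
    			    struct vmem_altmap *altmap, int migratetype);
    /* the onlining path passes MIGRATE_ISOLATE; other callers keep
     * MIGRATE_MOVABLE, preserving the previous behavior */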
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Michel Lespinasse <walken@google.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Link: https://lkml.kernel.org/r/20200819175957.28465-10-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
4eb29bd9d0
mm/page_alloc: drop stale pageblock comment in memmap_init_zone*()
...
Commit ac5d2539b2
("mm: meminit: reduce number of times pageblocks are
set during struct page init") moved the actual zone range check, leaving
only the alignment check for pageblocks.
Let's drop the stale comment and make the pageblock check easier to read.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-9-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
aac65321ba
mm/memory_hotplug: simplify page onlining
...
We don't allow offlining memory with holes, all boot memory is online,
and hotplugged memory cannot have holes.
We can now simplify the onlining of pages. As we only allow onlining/offlining
full sections and sections always span full MAX_ORDER_NR_PAGES, we can
just process pages in chunks of order MAX_ORDER - 1 without further special handling.
The number of onlined pages simply corresponds to the number of pages we
were requested to online.
While at it, refine the comment regarding the callback not exposing all
pages to the buddy.
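A hedged sketch of the simplified loop (condensed from the patch; online_page_callback is the existing hook, defaulting to generic_online_page):

    /* Sketch: section-aligned ranges allow fixed order MAX_ORDER - 1 chunks. */
    for (pfn = start_pfn; pfn < end_pfn; pfn += MAX_ORDER_NR_PAGES)
    	(*online_page_callback)(pfn_to_page(pfn), MAX_ORDER - 1);
    /* onlined pages == requested pages; no partial-chunk handling needed */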
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-8-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
3fa0c7c79d
mm/page_isolation: simplify return value of start_isolate_page_range()
...
Callers no longer need the number of isolated pageblocks. Let's simplify.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-7-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
ea15153c3d
mm/memory_hotplug: drop nr_isolate_pageblock in offline_pages()
...
We make sure that we cannot have any memory holes right at the beginning
of offline_pages(), and we only support onlining/offlining full sections.
Both sections and pageblocks are powers of two in size, and sections
always span full pageblocks.
We can directly calculate the number of isolated pageblocks from nr_pages.
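In code terms, roughly:

    /* Sketch: the isolated pageblock count follows from nr_pages alone. */
    zone->nr_isolate_pageblock -= nr_pages / pageblock_nr_pages;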
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-6-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
257bea7158
mm/page_alloc: simplify __offline_isolated_pages()
...
offline_pages() is the only user. __offline_isolated_pages() never gets
called with ranges that contain memory holes and we no longer care about
the return value. Drop the return value handling and all pfn_valid()
checks.
Update the documentation.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-5-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
0a1a9a0008
mm/memory_hotplug: simplify page offlining
...
We make sure that we cannot have any memory holes right at the beginning
of offline_pages(). We no longer need walk_system_ram_range() and can
call test_pages_isolated() and __offline_isolated_pages() directly.
offlined_pages always corresponds to nr_pages, so we can simplify that.
[akpm@linux-foundation.org: patch conflict resolution]
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-4-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
4986fac160
mm/memory_hotplug: enforce section granularity when onlining/offlining
...
Already two people (including me) tried to offline subsections, because
the function looks like it can deal with them. But we really can only
online/offline full sections that are properly aligned (e.g., we can only
mark full sections online/offline via SECTION_IS_ONLINE).
Add a simple safety net to document the restriction now. Current users
(core and powernv/memtrace) respect these restrictions.
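A hedged sketch of the safety net (condensed; the exact alignment expression may differ):

    /* Sketch: we can only online/offline full, aligned sections. */
    if (WARN_ON_ONCE(!nr_pages ||
    		 !IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)))
    	return -EINVAL;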
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200819175957.28465-3-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
David Hildenbrand
73a11c9658
mm/memory_hotplug: inline __offline_pages() into offline_pages()
...
Patch series "mm/memory_hotplug: online_pages()/offline_pages() cleanups", v2.
These are a bunch of cleanups for online_pages()/offline_pages() and
related code, mostly getting rid of memory hole handling that is no longer
necessary. There is only a single walk_system_ram_range() call left in
offline_pages(), to make sure we don't have any memory holes. I had some
of these patches lying around for a longer time but didn't have time to
polish them.
In addition, the last patch marks all pageblocks of memory to get onlined
MIGRATE_ISOLATE, so pages that have just been exposed to the buddy cannot
get allocated before onlining is complete. Once heavy lifting is done,
the pageblocks are set to MIGRATE_MOVABLE, such that allocations are
possible.
I played with DIMMs and virtio-mem on x86-64 and didn't spot any
surprises. I verified that the number of isolated pageblocks is correctly
handled when onlining/offlining.
This patch (of 10):
There is only a single user, offline_pages(). Let's inline it, to make
it look more similar to online_pages().
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Charan Teja Reddy <charante@codeaurora.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20200819175957.28465-1-david@redhat.com
Link: https://lkml.kernel.org/r/20200819175957.28465-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Jann Horn
c9682d1027
mm/mmu_notifier: fix mmget() assert in __mmu_interval_notifier_insert
...
The comment talks about having to hold mmget() (which means mm_users), but
the actual check is on mm_count (which would be mmgrab()).
Given that MMU notifiers are torn down in mmput() -> __mmput() ->
exit_mmap() -> mmu_notifier_release(), I believe that the comment is
correct and the check should be on mm->mm_users. Fix it up accordingly.
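In code terms (sketch of the one-line fix):

    /* before (wrong): asserted the mm_struct itself is pinned (mmgrab) */
    WARN_ON(atomic_read(&mm->mm_count) <= 0);
    /* after (fixed): assert the address space is pinned (mmget) */
    WARN_ON(atomic_read(&mm->mm_users) <= 0);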
Fixes: 99cb252f5e
("mm/mmu_notifier: add an interval tree notifier")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian König <christian.koenig@amd.com>
Link: https://lkml.kernel.org/r/20200901000143.207585-1-jannh@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Bartosz Golaszewski
295a173023
mm/util.c: update the kerneldoc for kstrdup_const()
...
Memory allocated with kstrdup_const() must not be passed to regular
krealloc() as it is not aware of the possibility of the chunk residing in
.rodata. Since there are no potential users of krealloc_const() at the
moment, let's just update the doc to make it explicit.
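Illustrative usage (the variable name is hypothetical):

    const char *name = kstrdup_const("fixed-name", GFP_KERNEL);
    /* "name" may point into .rodata instead of a heap chunk, so: */
    kfree_const(name);	/* fine: kfree_const() handles both cases */
    /* krealloc(name, ...) would be a bug, per the updated kerneldoc */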
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200817173927.23389-1-brgl@bgdev.pl
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Miaohe Lin
406100762a
mm/vmstat.c: use helper macro abs()
...
Use helper macro abs() to simplify the "x > t || x < -t" cmp.
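For reference, the simplification amounts to:

    if (x > t || x < -t)	/* before */
    if (abs(x) > t)		/* after */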
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200905084008.15748-1-linmiaohe@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Mateusz Nosek
11c9c7edae
mm/page_poison.c: replace bool variable with static key
...
Variable 'want_page_poisoning' is a switch deciding if page poisoning
should be enabled. This patch changes it to a static key.
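A hedged sketch of the conversion (condensed from the patch; poison_pages() stands in for the existing poisoning path):

    static DEFINE_STATIC_KEY_FALSE(want_page_poisoning);

    /* the early boot parameter handler flips the key: */
    static_branch_enable(&want_page_poisoning);

    /* the hot path becomes a patched branch instead of a memory load: */
    if (static_branch_unlikely(&want_page_poisoning))
    	poison_pages(page, n);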
Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Oscar Salvador <OSalvador@suse.com>
Link: https://lkml.kernel.org/r/20200921152931.938-1-mateusznosek0@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Oscar Salvador
b94e02822d
mm,hwpoison: try to narrow window race for free pages
...
Aristeu Rozanski reported that a customer test case started to report
-EBUSY after the hwpoison rework patchset.
There is a race window between spotting a free page and taking it off its
buddy freelist, so it might be that by the time we try to take it off, the
page has already been allocated.
This patch tries to close that race window by handling the new type of
page again if the page was allocated under us.
Reported-by: Aristeu Rozanski <aris@ruivo.org>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Aristeu Rozanski <aris@ruivo.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-15-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Naoya Horiguchi
1f2481ddbe
mm,hwpoison: double-check page count in __get_any_page()
...
Soft offlining could fail with EIO due to a race condition with hugepage
migration. This issue became visible due to the change in the previous patch
that makes the soft offline handler take the page refcount on its own. We have
no way to directly pin a zero-refcount page, and a page considered to have
zero refcount could be allocated just after the first check.
This patch adds a second check to detect the race and gives us a chance to
handle it more reliably.
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-14-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Naoya Horiguchi
5d1fd5dc87
mm,hwpoison: introduce MF_MSG_UNSPLIT_THP
...
memory_failure() is supposed to call action_result() when it handles a
memory error event, but there's one missing case. So let's add it.
I find that include/ras/ras_event.h has some other MF_MSG_* undefined, so
this patch also adds them.
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-13-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:17 -07:00
Oscar Salvador
5a2ffca3c2
mm,hwpoison: return 0 if the page is already poisoned in soft-offline
...
Currently, there is an inconsistency when calling soft-offline from
different paths on a page that is already poisoned.
1) madvise:
madvise_inject_error skips any poisoned page and continues
the loop.
If that was the only page to madvise, it returns 0.
2) /sys/devices/system/memory/:
When calling soft_offline_page_store()->soft_offline_page(),
we return -EBUSY in case the page is already poisoned.
This is inconsistent with a) the above example and b)
memory_failure, where we return 0 if the page was poisoned.
Fix this by dropping the PageHWPoison() check in madvise_inject_error, and
let soft_offline_page return 0 if it finds the page already poisoned.
Please, note that this represents a user-api change, since now the return
error when calling soft_offline_page_store()->soft_offline_page() will be
different.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-12-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
6b9a217eda
mm,hwpoison: refactor soft_offline_huge_page and __soft_offline_page
...
Merging soft_offline_huge_page and __soft_offline_page lets us get rid of
quite some duplicated code, and makes the code much easier to follow.
Now, __soft_offline_page will handle both normal and hugetlb pages.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-11-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
79f5f8fab4
mm,hwpoison: rework soft offline for in-use pages
...
This patch changes the way we set and handle in-use poisoned pages. Until
now, poisoned pages were released to the buddy allocator, trusting that
the checks that take place at allocation time would act as a safety net and
would skip that page.
This has proved to be wrong, as there are pfn walkers out there, like
compaction, that only care about the page being in a buddy freelist.
Although this might not be the only user, having poisoned pages in the
buddy allocator seems a bad idea as we should only have free pages that
are ready and meant to be used as such.
Before explaining the taken approach, let us break down the kind of pages
we can soft offline.
- Anonymous THP (after the split, they end up being 4K pages)
- Hugetlb
- Order-0 pages (that can be either migrated or invalidated)
* Normal pages (order-0 and anon-THP)
- If they are clean and unmapped page cache pages, we invalidate
them by means of invalidate_inode_page().
- If they are mapped/dirty, we do the isolate-and-migrate dance.
Either way, we do not call put_page directly from those paths. Instead, we
keep the page and send it to page_handle_poison to perform the right
handling.
page_handle_poison sets the HWPoison flag and does the last put_page.
Down the chain, we placed a check for HWPoison page in
free_pages_prepare, that just skips any poisoned page, so those pages
do not end up in any pcplist/freelist.
After that, we set the refcount on the page to 1 and we increment
the poisoned pages counter.
If we see that the check in free_pages_prepare creates trouble, we can
always do what we do for free pages:
- wait until the page hits buddy's freelists
- take it off, and flag it
The downside of the above approach is that we could race with an
allocation, so by the time we want to take the page off the buddy, the
page has been already allocated so we cannot soft offline it.
But the user could always retry it.
* Hugetlb pages
- We isolate-and-migrate them
After the migration has been successful, we call dissolve_free_huge_page,
and we set HWPoison on the page if we succeed.
Hugetlb has a slightly different handling though.
While for non-hugetlb pages we cared about closing the race with an
allocation, doing so for hugetlb pages requires quite some additional
and intrusive code (we would need to hook in free_huge_page and some other
places).
So I decided to not make the code overly complicated and just fail
normally if the page was allocated in the meantime.
We can always build on top of this.
As a bonus, because of the way we now handle in-use pages, we no longer
need the put-as-isolation-migratetype dance that guarded against poisoned
pages ending up in pcplists.
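A condensed sketch of the containment (helper name per this series; simplified):

    /* Sketch: after invalidation or successful migration, the page is
     * flagged and pinned instead of being released to the buddy. */
    static void page_handle_poison(struct page *page)
    {
    	SetPageHWPoison(page);
    	page_ref_inc(page);		/* refcount pinned at 1 */
    	num_poisoned_pages_inc();
    }
    /* free_pages_prepare() additionally skips PageHWPoison pages, so
     * poisoned pages can never land on a pcplist or freelist. */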
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-10-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
06be6ff3d2
mm,hwpoison: rework soft offline for free pages
...
When trying to soft-offline a free page, we need to first take it off the
buddy allocator. Once we know it is out of reach, we can safely flag it as
poisoned.
take_page_off_buddy will be used to take a page meant to be poisoned off
the buddy allocator. take_page_off_buddy calls break_down_buddy_pages,
which splits a higher-order page in case our page belongs to one.
Once the page is under our control, we call page_handle_poison to set it
as poisoned and grab a refcount on it.
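Sketched flow for the free-page case (simplified, return values illustrative):

    if (take_page_off_buddy(page)) {
    	/* the page is off the freelists and under our control */
    	page_handle_poison(page);	/* SetPageHWPoison + refcount to 1 */
    	return 0;
    }
    return -EBUSY;	/* raced with an allocation; the caller may retry */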
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-9-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
694bf0b0cd
mm,hwpoison: unify THP handling for hard and soft offline
...
Place the THP's page handling in a helper and use it from both hard and
soft-offline machinery, so we get rid of some duplicated code.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-8-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
dd6e2402fa
mm,hwpoison: kill put_hwpoison_page
...
After commit 4e41a30c6d
("mm: hwpoison: adjust for new thp
refcounting"), put_hwpoison_page got reduced to a put_page. Let us just
use put_page instead.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-7-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
dc7560b496
mm,hwpoison: refactor madvise_inject_error
...
Make a proper if-else condition for {hard,soft}-offline.
Signed-off-by: Oscar Salvador <osalvador@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Link: https://lkml.kernel.org/r/20200908075626.11976-3-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Oscar Salvador
7e27f22c9e
mm,hwpoison: unexport get_hwpoison_page and make it static
...
Since get_hwpoison_page is only used in memory-failure code now, let us
un-export it and make it private to that code.
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-5-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Naoya Horiguchi
fd476720c9
mm,hwpoison-inject: don't pin for hwpoison_filter
...
Another memory error injection interface, debugfs:hwpoison/corrupt-pfn, also
takes a bogus refcount for hwpoison_filter(). Dropping it is justified
because this does only a coarse filter, expecting that memory_failure()
redoes the check for sure.
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-4-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Naoya Horiguchi
1b473becde
mm, hwpoison: remove recalculating hpage
...
hpage is never used after try_to_split_thp_page() in memory_failure(), so
we don't have to update it. Let's not recalculate/use hpage.
Suggested-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com >
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com >
Signed-off-by: Oscar Salvador <osalvador@suse.de >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com >
Cc: Aristeu Rozanski <aris@ruivo.org >
Cc: Dave Hansen <dave.hansen@intel.com >
Cc: David Hildenbrand <david@redhat.com >
Cc: Dmitry Yakunin <zeil@yandex-team.ru >
Cc: Michal Hocko <mhocko@kernel.org >
Cc: Oscar Salvador <osalvador@suse.com >
Cc: Qian Cai <cai@lca.pw >
Cc: Tony Luck <tony.luck@intel.com >
Link: https://lkml.kernel.org/r/20200922135650.1634-3-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org >
2020-10-16 11:11:16 -07:00
Naoya Horiguchi
7d9d46ac87
mm,hwpoison: cleanup unused PageHuge() check
...
Patch series "HWPOISON: soft offline rework", v7.
This patchset fixes a couple of issues that the patchset Naoya sent [1]
contained due to rebasing problems and a misunderstanding.
Main focus of this series is to stabilize soft offline. Historically soft
offlined pages have suffered from racy conditions because PageHWPoison is
used to a little too aggressively, which (directly or indirectly) invades
other mm code which cares little about hwpoison. This results in
unexpected behavior or kernel panic, which is very far from soft offline's
"do not disturb userspace or other kernel component" policy. An example
of this can be found here [2].
Along with several cleanups, this code refactors and changes the way soft
offline works. The main point of this change set is to contain the target
page "via the buddy allocator" or in the migration path. For the former we
first free the target page as we do for normal pages, and once it has
reached the buddy allocator and has been taken off the freelists, we flag it
as HWpoison. For the latter we never get to release the page in
unmap_and_move, so the page is under our control and we can handle it in
hwpoison code.
[1] https://patchwork.kernel.org/cover/11704083/
[2] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u
This patch (of 14):
Drop the PageHuge check, which is dead code since memory_failure() forks
into memory_failure_hugetlb() for hugetlb pages.
memory_failure() and memory_failure_hugetlb() share some functions like
hwpoison_user_mappings() and identify_page_state(), so those should
properly handle 4kB pages, THP, and hugetlb pages.
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dmitry Yakunin <zeil@yandex-team.ru>
Cc: Qian Cai <cai@lca.pw>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Aristeu Rozanski <aris@ruivo.org>
Cc: Oscar Salvador <osalvador@suse.com>
Link: https://lkml.kernel.org/r/20200922135650.1634-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20200922135650.1634-2-osalvador@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
b1647dc0de
mm/readahead: pass a file_ra_state into force_page_cache_ra
...
The file_ra_state being passed into page_cache_sync_readahead() was being
ignored in favour of using the one embedded in the struct file. The only
caller for which this makes a difference is the fsverity code if the file
has been marked as POSIX_FADV_RANDOM, but it's confusing and worth fixing.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-10-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
db660d4625
mm/filemap: fold ra_submit into do_sync_mmap_readahead
...
Fold ra_submit() into its last remaining user and pass the
readahead_control struct to both do_page_cache_ra() and
page_cache_sync_ra().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-9-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
fefa7c478f
mm/readahead: add page_cache_sync_ra and page_cache_async_ra
...
Reimplement page_cache_sync_readahead() and page_cache_async_readahead()
as wrappers around versions of the function which take a readahead_control
in preparation for making do_sync_mmap_readahead() pass down an RAC
struct.
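The wrapper shape, sketched (close to the merged helpers, but treat the details as illustrative):

    static inline
    void page_cache_sync_readahead(struct address_space *mapping,
    		struct file_ra_state *ra, struct file *file,
    		pgoff_t index, unsigned long req_count)
    {
    	DEFINE_READAHEAD(ractl, file, mapping, index);
    	page_cache_sync_ra(&ractl, ra, req_count);
    }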
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-8-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
7b3df3b9ac
mm/readahead: pass readahead_control to force_page_cache_ra
...
Reimplement force_page_cache_readahead() as a wrapper around
force_page_cache_ra(). Pass the existing readahead_control from
page_cache_sync_readahead().
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
David Howells
6e4af69ae9
mm/readahead: make ondemand_readahead take a readahead_control
...
Make ondemand_readahead() take a readahead_control struct in preparation
for making do_sync_mmap_readahead() pass down an RAC struct.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
8238287ead
mm/readahead: make do_page_cache_ra take a readahead_control
...
Rename __do_page_cache_readahead() to do_page_cache_ra() and call it
directly from ondemand_readahead() instead of indirecting via ra_submit().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
73bb49da50
mm/readahead: make page_cache_ra_unbounded take a readahead_control
...
Define it in the callers instead of in page_cache_ra_unbounded().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Eric Biggers <ebiggers@google.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:16 -07:00
Matthew Wilcox (Oracle)
1aa83cfa5a
mm/readahead: add DEFINE_READAHEAD
...
Patch series "Readahead patches for 5.9/5.10".
These are infrastructure for both the THP patchset and for the fscache
rewrite.
For both pieces of infrastructure being built on top of this patchset, we
want the ractl to be available higher in the call-stack.
For David's work, he wants to add the 'critical page' to the ractl so that
he knows which page NEEDS to be brought in from storage, and which ones
are nice-to-have. We might want something similar in block storage too.
It used to be simple -- the first page was the critical one, but then mmap
added fault-around and so for that usecase, the middle page is the
critical one. Anyway, I don't have any code to show that yet, we just
know that the lowest point in the callchain where we have that information
is do_sync_mmap_readahead() and so the ractl needs to start its life
there.
For THP, we have the code that needs it. It's actually the apex patch to
the series; the one which finally starts to allocate THPs and present them
to consenting filesystems:
798bcf30ab
This patch (of 8):
Allow for a more concise definition of a struct readahead_control.
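A sketch of the macro (field names per the readahead_control struct at the time; hedged):

    #define DEFINE_READAHEAD(rac, f, m, i)			\
    	struct readahead_control rac = {		\
    		.file = f,				\
    		.mapping = m,				\
    		._index = i,				\
    	}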
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric Biggers <ebiggers@google.com>
Cc: David Howells <dhowells@redhat.com>
Link: https://lkml.kernel.org/r/20200903140844.14194-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20200903140844.14194-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Huang Ying
c4f9c701f9
mm: fix a race during THP splitting
...
It is reported that the following bug is triggered if an HDD is used as
the swap device:
[ 5758.157556] BUG: kernel NULL pointer dereference, address: 0000000000000007
[ 5758.165331] #PF: supervisor write access in kernel mode
[ 5758.171161] #PF: error_code(0x0002) - not-present page
[ 5758.176894] PGD 0 P4D 0
[ 5758.183614] Oops: 0002 [#1] SMP PTI
[ 5758.183614] CPU: 10 PID: 316 Comm: kswapd1 Kdump: loaded Tainted: G S --------- --- 5.9.0-0.rc3.1.tst.el8.x86_64 #1
[ 5758.196717] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
[ 5758.208176] RIP: 0010:split_swap_cluster+0x47/0x60
[ 5758.213522] Code: c1 e3 06 48 c1 eb 0f 48 8d 1c d8 48 89 df e8 d0 20 6a 00 80 63 07 fb 48 85 db 74 16 48 89 df c6 07 00 66 66 66 90 31 c0 5b c3 <80> 24 25 07 00 00 00 fb 31 c0 5b c3 b8 f0 ff ff ff 5b c3 66 0f 1f
[ 5758.234478] RSP: 0018:ffffb147442d7af0 EFLAGS: 00010246
[ 5758.240309] RAX: 0000000000000000 RBX: 000000000014b217 RCX: ffffb14779fd9000
[ 5758.248281] RDX: 000000000014b217 RSI: ffff9c52f2ab1400 RDI: 000000000014b217
[ 5758.256246] RBP: ffffe00c51168080 R08: ffffe00c5116fe08 R09: ffff9c52fffd3000
[ 5758.264208] R10: ffffe00c511537c8 R11: ffff9c52fffd3c90 R12: 0000000000000000
[ 5758.272172] R13: ffffe00c51170000 R14: ffffe00c51170000 R15: ffffe00c51168040
[ 5758.280134] FS: 0000000000000000(0000) GS:ffff9c52f2a80000(0000) knlGS:0000000000000000
[ 5758.289163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5758.295575] CR2: 0000000000000007 CR3: 0000000022a0e003 CR4: 00000000000606e0
[ 5758.303538] Call Trace:
[ 5758.306273] split_huge_page_to_list+0x88b/0x950
[ 5758.311433] deferred_split_scan+0x1ca/0x310
[ 5758.316202] do_shrink_slab+0x12c/0x2a0
[ 5758.320491] shrink_slab+0x20f/0x2c0
[ 5758.324482] shrink_node+0x240/0x6c0
[ 5758.328469] balance_pgdat+0x2d1/0x550
[ 5758.332652] kswapd+0x201/0x3c0
[ 5758.336157] ? finish_wait+0x80/0x80
[ 5758.340147] ? balance_pgdat+0x550/0x550
[ 5758.344525] kthread+0x114/0x130
[ 5758.348126] ? kthread_park+0x80/0x80
[ 5758.352214] ret_from_fork+0x22/0x30
[ 5758.356203] Modules linked in: fuse zram rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 iTCO_wdt crct10dif_pclmul iTCO_vendor_support drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops cec rapl joydev intel_cstate ipmi_si ipmi_devintf drm intel_uncore i2c_i801 ipmi_msghandler pcspkr lpc_ich mei_me i2c_smbus mei ioatdma ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg igb ahci libahci i2c_algo_bit crc32c_intel libata dca wmi dm_mirror dm_region_hash dm_log dm_mod
[ 5758.412673] CR2: 0000000000000007
[ 0.000000] Linux version 5.9.0-0.rc3.1.tst.el8.x86_64 (mockbuild@x86-vm-15.build.eng.bos.redhat.com) (gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), GNU ld version 2.30-79.el8) #1 SMP Wed Sep 9 16:03:34 EDT 2020
After further digging it's found that the following race condition exists in
the original implementation:

CPU1                                       CPU2
----                                       ----
deferred_split_scan()
  split_huge_page(page) /* page isn't compound head */
    split_huge_page_to_list(page, NULL)
      __split_huge_page(page, )
        ClearPageCompound(head)
        /* unlock all subpages except page (not head) */
                                           add_to_swap(head) /* not THP */
                                             get_swap_page(head)
                                             add_to_swap_cache(head, )
                                               SetPageSwapCache(head)
        if PageSwapCache(head)
          split_swap_cluster(/* swap entry of head */)
            /* Deref sis->cluster_info: NULL accessing! */
So, in split_huge_page_to_list(), PageSwapCache() is called for the already
split and unlocked "head", which may be added to the swap cache on another CPU. So
split_swap_cluster() may be called wrongly.
To fix the race, the call to split_swap_cluster() is moved to
__split_huge_page() before all subpages are unlocked. So that the
PageSwapCache() is stable.
Fixes: 59807685a7
("mm, THP, swap: support splitting THP for THP swap out")
Reported-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Link: https://lkml.kernel.org/r/20201009073647.1531083-1-ying.huang@intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
01c7026705
fs: add a filesystem flag for THPs
...
The page cache needs to know whether the filesystem supports THPs so that
it doesn't send THPs to filesystems which can't handle them. Dave Chinner
points out that getting from the page mapping to the filesystem type is
too many steps (mapping->host->i_sb->s_type->fs_flags) so cache that
information in the address space flags.
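Sketch of both ends (flag and helper names per this patch, hedged):

    /* fs side: declare support once in file_system_type.fs_flags via
     * FS_THP_SUPPORT; inode setup mirrors it into the mapping. */

    /* page cache side: one flag test instead of four pointer hops: */
    if (mapping_thp_support(mapping))	/* tests AS_THP_SUPPORT */
    	/* this mapping may be handed THPs */;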
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Link: https://lkml.kernel.org/r/20200916032717.22917-1-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
3efe62e466
mm/vmscan: allow arbitrary sized pages to be paged out
...
Remove the assumption that a compound page has HPAGE_PMD_NR pins from the
page cache.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-12-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
8854a6a724
mm/page-writeback: support tail pages in wait_for_stable_page
...
page->mapping is undefined for tail pages, so operate exclusively on the
head page.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-11-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
fc3a5ac528
mm/truncate: fix truncation for pages of arbitrary size
...
Remove the assumption that a compound page is HPAGE_PMD_SIZE, and the
assumption that any page is PAGE_SIZE.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-10-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
5eaf35ab12
mm/rmap: fix assumptions of THP size
...
Ask the page what size it is instead of assuming it's PMD size. Do this
for anon pages as well as file pages for when someone decides to support
that. Leave the assumption alone for pages which are PMD mapped; we don't
currently grow THPs beyond PMD size, so we don't need to change this code
yet.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-9-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
e2333dad2d
mm/huge_memory: fix can_split_huge_page assumption of THP size
...
Ask the page how many subpages it has instead of assuming it's PMD size.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: "Huang, Ying" <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-8-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
65dfe3c3bc
mm/huge_memory: fix page_trans_huge_mapcount assumption of THP size
...
Ask the page what size it is instead of assuming it's PMD size.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-7-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Kirill A. Shutemov
8cce547568
mm/huge_memory: fix split assumption of page size
...
File THPs may now be of arbitrary size, and we can't rely on that size
after doing the split, so remember the number of pages before we start the
split.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-6-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Kirill A. Shutemov
86b562b629
mm/huge_memory: fix total_mapcount assumption of page size
...
File THPs may now be of arbitrary order.
Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-5-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
8fb156c9ee
mm/page_owner: change split_page_owner to take a count
...
The implementation of split_page_owner() prefers a count rather than the
old order of the page. When we support variable-size THPs, we won't
have the order at this point, but we will have the number of pages.
So change the interface to what the caller and callee would prefer.
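The interface change, sketched:

    /* before */ void split_page_owner(struct page *page, unsigned int order);
    /* after  */ void split_page_owner(struct page *page, unsigned int nr);
    /* callers now pass the number of pages, e.g. thp_nr_pages(page) */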
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: SeongJae Park <sjpark@amazon.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-4-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00
Matthew Wilcox (Oracle)
d01ac3c352
mm/memory: remove page fault assumption of compound page size
...
A compound page in the page cache will not necessarily be of PMD size,
so check explicitly.
[willy@infradead.org: fix remove page fault assumption of compound page size]
Link: https://lkml.kernel.org/r/20201001152259.14932-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20200908195539.25896-3-willy@infradead.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16 11:11:15 -07:00