mm/devm_memremap_pages: fix final page put race

Logan noticed that devm_memremap_pages_release() kills the percpu_ref
drops all the page references that were acquired at init and then
immediately proceeds to unplug, arch_remove_memory(), the backing pages
for the pagemap.  If for some reason device shutdown actually collides
with a busy / elevated-ref-count page then arch_remove_memory() should
be deferred until after that reference is dropped.

As it stands the "wait for last page ref drop" happens *after*
devm_memremap_pages_release() returns, which is obviously too late and
can lead to crashes.

Fix this situation by assigning the responsibility to wait for the
percpu_ref to go idle to devm_memremap_pages() with a new ->cleanup()
callback.  Implement the new cleanup callback for all
devm_memremap_pages() users: pmem, devdax, hmm, and p2pdma.

Link: http://lkml.kernel.org/r/155727339156.292046.5432007428235387859.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: 41e94a8513 ("add devm_memremap_pages")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This commit is contained in:
Dan Williams
2019-06-13 15:56:33 -07:00
committed by Linus Torvalds
parent 1570175abd
commit 50f44ee724
7 changed files with 38 additions and 44 deletions

View File

@@ -303,11 +303,19 @@ static const struct attribute_group *pmem_attribute_groups[] = {
NULL,
};
static void pmem_release_queue(void *q)
static void __pmem_release_queue(struct percpu_ref *ref)
{
struct request_queue *q;
q = container_of(ref, typeof(*q), q_usage_counter);
blk_cleanup_queue(q);
}
static void pmem_release_queue(void *ref)
{
__pmem_release_queue(ref);
}
static void pmem_freeze_queue(struct percpu_ref *ref)
{
struct request_queue *q;
@@ -399,12 +407,10 @@ static int pmem_attach_disk(struct device *dev,
if (!q)
return -ENOMEM;
if (devm_add_action_or_reset(dev, pmem_release_queue, q))
return -ENOMEM;
pmem->pfn_flags = PFN_DEV;
pmem->pgmap.ref = &q->q_usage_counter;
pmem->pgmap.kill = pmem_freeze_queue;
pmem->pgmap.cleanup = __pmem_release_queue;
if (is_nd_pfn(dev)) {
if (setup_pagemap_fsdax(dev, &pmem->pgmap))
return -ENOMEM;
@@ -425,6 +431,9 @@ static int pmem_attach_disk(struct device *dev,
pmem->pfn_flags |= PFN_MAP;
memcpy(&bb_res, &pmem->pgmap.res, sizeof(bb_res));
} else {
if (devm_add_action_or_reset(dev, pmem_release_queue,
&q->q_usage_counter))
return -ENOMEM;
addr = devm_memremap(dev, pmem->phys_addr,
pmem->size, ARCH_MEMREMAP_PMEM);
memcpy(&bb_res, &nsio->res, sizeof(bb_res));