mm: turn migrate_vma upside down

There isn't any good reason to pass callbacks to migrate_vma.  Instead
we can just export the three steps done by this function to drivers and
let them sequence the operation without callbacks.  This removes a lot
of boilerplate code as-is, and will allow the drivers to drastically
improve code flow and error handling further on.

Link: https://lore.kernel.org/r/20190814075928.23766-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Christoph Hellwig
2019-08-14 09:59:19 +02:00
committed by Jason Gunthorpe
parent f4fb3b9c19
commit a7d1f22bb7
4 changed files with 193 additions and 343 deletions

@@ -182,107 +182,27 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
}
/*
* struct migrate_vma_ops - migrate operation callback
*
* @alloc_and_copy: alloc destination memory and copy source memory to it
* @finalize_and_map: allow caller to map the successfully migrated pages
*
*
* The alloc_and_copy() callback happens once all source pages have been locked,
* unmapped and checked (checked whether pinned or not). All pages that can be
* migrated will have an entry in the src array set with the pfn value of the
* page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE flag set (other
* flags might be set but should be ignored by the callback).
*
* The alloc_and_copy() callback can then allocate destination memory and copy
* source memory to it for all those entries (ie with MIGRATE_PFN_VALID and
* MIGRATE_PFN_MIGRATE flag set). Once these are allocated and copied, the
* callback must update each corresponding entry in the dst array with the pfn
* value of the destination page and with the MIGRATE_PFN_VALID and
* MIGRATE_PFN_LOCKED flags set (destination pages must have their struct pages
* locked, via lock_page()).
*
* At this point the alloc_and_copy() callback is done and returns.
*
* Note that the callback does not have to migrate all the pages that are
* marked with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration
* from device memory to system memory (ie the MIGRATE_PFN_DEVICE flag is also
* set in the src array entry). If the device driver cannot migrate a device
* page back to system memory, then it must set the corresponding dst array
* entry to MIGRATE_PFN_ERROR. This will trigger a SIGBUS if CPU tries to
* access any of the virtual addresses originally backed by this page. Because
* a SIGBUS is such a severe result for the userspace process, the device
* driver should avoid setting MIGRATE_PFN_ERROR unless it is really in an
* unrecoverable state.
*
* For empty entries in the CPU page table (pte_none() or pmd_none() is true) we
* do set the MIGRATE_PFN_MIGRATE flag in the corresponding source array entry,
* thus allowing the device driver to allocate device memory for those unbacked
* virtual addresses. For this the device driver simply has to allocate device
* memory and properly set the destination entry like for regular migration.
* Note that this can still fail, and thus the device driver must check whether
* the migration was successful for those entries inside the finalize_and_map()
* callback just like for regular migration.
*
* THE alloc_and_copy() CALLBACK MUST NOT CHANGE ANY OF THE SRC ARRAY ENTRIES
* OR BAD THINGS WILL HAPPEN !
*
*
* The finalize_and_map() callback happens after struct page migration from
* source to destination (destination struct pages are the struct pages for the
* memory allocated by the alloc_and_copy() callback). Migration can fail, and
* thus the finalize_and_map() allows the driver to inspect which pages were
* successfully migrated, and which were not. Successfully migrated pages will
* have the MIGRATE_PFN_MIGRATE flag set for their src array entry.
*
* It is safe to update device page table from within the finalize_and_map()
* callback because both destination and source page are still locked, and the
* mmap_sem is held in read mode (hence no one can unmap the range being
* migrated).
*
* Once callback is done cleaning up things and updating its page table (if it
* chose to do so, this is not an obligation) then it returns. At this point,
* the HMM core will finish up the final steps, and the migration is complete.
*
* THE finalize_and_map() CALLBACK MUST NOT CHANGE ANY OF THE SRC OR DST ARRAY
* ENTRIES OR BAD THINGS WILL HAPPEN !
*/
struct migrate_vma_ops {
void (*alloc_and_copy)(struct vm_area_struct *vma,
const unsigned long *src,
unsigned long *dst,
unsigned long start,
unsigned long end,
void *private);
void (*finalize_and_map)(struct vm_area_struct *vma,
const unsigned long *src,
const unsigned long *dst,
unsigned long start,
unsigned long end,
void *private);
struct migrate_vma {
struct vm_area_struct *vma;
/*
* Both src and dst array must be big enough for
* (end - start) >> PAGE_SHIFT entries.
*
* The src array must not be modified by the caller after
* migrate_vma_setup(), and must not change the dst array after
* migrate_vma_pages() returns.
*/
unsigned long *dst;
unsigned long *src;
unsigned long cpages;
unsigned long npages;
unsigned long start;
unsigned long end;
};
#if defined(CONFIG_MIGRATE_VMA_HELPER)
int migrate_vma(const struct migrate_vma_ops *ops,
struct vm_area_struct *vma,
unsigned long start,
unsigned long end,
unsigned long *src,
unsigned long *dst,
void *private);
#else
static inline int migrate_vma(const struct migrate_vma_ops *ops,
struct vm_area_struct *vma,
unsigned long start,
unsigned long end,
unsigned long *src,
unsigned long *dst,
void *private)
{
return -EINVAL;
}
#endif /* IS_ENABLED(CONFIG_MIGRATE_VMA_HELPER) */
int migrate_vma_setup(struct migrate_vma *args);
void migrate_vma_pages(struct migrate_vma *migrate);
void migrate_vma_finalize(struct migrate_vma *migrate);
#endif /* CONFIG_MIGRATION */
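
The three steps declared above could be sequenced by a driver roughly as follows. This is a minimal userspace sketch, not code from this commit: the `migrate_vma_*` bodies are stubs so that it compiles outside the kernel, the flag values and `PAGE_SHIFT` are placeholders, and `demo_alloc_and_copy()` and `demo_migrate_range()` are hypothetical names standing in for driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder values so the sketch builds in userspace (assumptions). */
#define PAGE_SHIFT          12
#define MIGRATE_PFN_VALID   (1UL << 0)
#define MIGRATE_PFN_MIGRATE (1UL << 1)
#define MIGRATE_PFN_LOCKED  (1UL << 2)
#define MIGRATE_PFN_SHIFT   6

struct vm_area_struct;          /* opaque here; the real type is kernel-only */

struct migrate_vma {
	struct vm_area_struct *vma;
	unsigned long *dst;
	unsigned long *src;
	unsigned long cpages;
	unsigned long npages;
	unsigned long start;
	unsigned long end;
};

/* Stub: pretends every page in the range was collected and may migrate. */
int migrate_vma_setup(struct migrate_vma *args)
{
	unsigned long i;

	if (args->start >= args->end)
		return -EINVAL;
	args->npages = (args->end - args->start) >> PAGE_SHIFT;
	args->cpages = args->npages;
	for (i = 0; i < args->npages; i++) {
		args->src[i] = ((i + 1) << MIGRATE_PFN_SHIFT) |
			       MIGRATE_PFN_VALID | MIGRATE_PFN_MIGRATE;
		args->dst[i] = 0;
	}
	return 0;
}

/* Stubs: the real functions migrate struct pages and then unlock them. */
void migrate_vma_pages(struct migrate_vma *migrate) { (void)migrate; }
void migrate_vma_finalize(struct migrate_vma *migrate) { (void)migrate; }

/* Hypothetical driver step that replaces the old alloc_and_copy() callback. */
static void demo_alloc_and_copy(struct migrate_vma *args)
{
	unsigned long i;

	for (i = 0; i < args->npages; i++) {
		if (!(args->src[i] & MIGRATE_PFN_MIGRATE))
			continue;
		/* Fake destination pfn; a driver would allocate and copy here. */
		args->dst[i] = ((1000 + i) << MIGRATE_PFN_SHIFT) |
			       MIGRATE_PFN_VALID | MIGRATE_PFN_LOCKED;
	}
}

/* Hypothetical driver entry point: runs the three steps without callbacks. */
int demo_migrate_range(struct vm_area_struct *vma, unsigned long start,
		       unsigned long end, unsigned long *src,
		       unsigned long *dst)
{
	struct migrate_vma args = {
		.vma = vma, .start = start, .end = end,
		.src = src, .dst = dst,
	};
	int ret;

	ret = migrate_vma_setup(&args);         /* step 1: collect and unmap */
	if (ret)
		return ret;
	assert(args.cpages <= args.npages);

	if (args.cpages) {
		demo_alloc_and_copy(&args);     /* driver work, inline */
		migrate_vma_pages(&args);       /* step 2: migrate struct pages */
		/* A driver could update its device page tables here. */
	}
	migrate_vma_finalize(&args);            /* step 3: finish up */
	return 0;
}
```

Because the driver owns the control flow, an error in its own allocation or copy step can be handled with ordinary local cleanup instead of being signalled back through a callback and a private pointer.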