KVM: PPC: Avoid marking DMA-mapped pages dirty in real mode
At the moment the real mode handler of H_PUT_TCE calls iommu_tce_xchg_rm() which in turn reads the old TCE and, if it was a valid entry, marks the physical page dirty if it was mapped for writing. Since it is in real mode, realmode_pfn_to_page() is used instead of pfn_to_page() to get the page struct. However SetPageDirty() itself reads the compound page head, which is a virtual address for the head page struct, and setting the dirty bit on that kills the system.

This adds additional dirty bit tracking into the MM/IOMMU API for use in real mode. Note that this does not change how VFIO and KVM (in virtual mode) set this bit.

The KVM (real mode) changes include:
- use the lowest bit of the cached host phys address to carry the dirty bit;
- mark pages dirty when they are unpinned, which happens when the preregistered memory is released, which always happens in virtual mode;
- add a mm_iommu_ua_mark_dirty_rm() helper to set the delayed dirty bit;
- change iommu_tce_xchg_rm() to take the kvm struct for the mm to use in the new mm_iommu_ua_mark_dirty_rm() helper;
- move iommu_tce_xchg_rm() to book3s_64_vio_hv.c (which is the only caller anyway) to reduce the real mode KVM and IOMMU knowledge spread across different subsystems.

This removes realmode_pfn_to_page() as it is not used anymore.

While we're at it, remove some EXPORT_SYMBOL_GPL() as that code is for real mode only and modules cannot call it anyway.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

Committed by: Paul Mackerras
Parent: bdf7ffc899
Commit: 425333bf3a
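
The low-bit trick in the list above is easy to picture outside the kernel: because the cached host physical addresses are page aligned, bit 0 is free to carry a deferred dirty flag that the real-mode path can set without touching struct page, and that the virtual-mode unpin path consumes later. Below is a minimal standalone sketch of that idea; the names (hpa_cache, mark_dirty_rm, unpin_all) are hypothetical stand-ins for illustration and do not correspond to the actual kernel helpers.

/*
 * Standalone sketch (not kernel code): stash a deferred dirty flag in
 * bit 0 of a page-aligned cached host physical address, set it from the
 * "real mode" path, and consume it when the pages are "unpinned".
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT	16		/* 64K pages, as commonly used on PPC64 */
#define HPA_DIRTY	0x1ULL		/* lowest bit of the cached hpa */

static uint64_t hpa_cache[4];		/* one cached hpa per pinned page */

/* Real-mode path: only flip a bit in the cache, never touch struct page. */
static void mark_dirty_rm(unsigned long entry)
{
	hpa_cache[entry] |= HPA_DIRTY;
}

/* Virtual-mode unpin path: here it would be safe to call SetPageDirty(). */
static void unpin_all(void)
{
	for (unsigned long i = 0; i < 4; i++) {
		bool dirty = hpa_cache[i] & HPA_DIRTY;
		uint64_t hpa = hpa_cache[i] & ~HPA_DIRTY;

		printf("entry %lu: hpa 0x%llx %s\n", i,
		       (unsigned long long)hpa,
		       dirty ? "-> SetPageDirty()" : "(clean)");
	}
}

int main(void)
{
	for (unsigned long i = 0; i < 4; i++)
		hpa_cache[i] = (0x1000ULL + i) << PAGE_SHIFT;	/* page-aligned hpas */

	mark_dirty_rm(1);	/* e.g. H_PUT_TCE replaced a writable TCE in real mode */
	mark_dirty_rm(3);
	unpin_all();		/* preregistered memory released in virtual mode */
	return 0;
}
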
@@ -308,55 +308,6 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 {
 }
 
-/*
- * We do not have access to the sparsemem vmemmap, so we fallback to
- * walking the list of sparsemem blocks which we already maintain for
- * the sake of crashdump. In the long run, we might want to maintain
- * a tree if performance of that linear walk becomes a problem.
- *
- * realmode_pfn_to_page functions can fail due to:
- * 1) As real sparsemem blocks do not lay in RAM continously (they
- * are in virtual address space which is not available in the real mode),
- * the requested page struct can be split between blocks so get_page/put_page
- * may fail.
- * 2) When huge pages are used, the get_page/put_page API will fail
- * in real mode as the linked addresses in the page struct are virtual
- * too.
- */
-struct page *realmode_pfn_to_page(unsigned long pfn)
-{
-	struct vmemmap_backing *vmem_back;
-	struct page *page;
-	unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
-	unsigned long pg_va = (unsigned long) pfn_to_page(pfn);
-
-	for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->list) {
-		if (pg_va < vmem_back->virt_addr)
-			continue;
-
-		/* After vmemmap_list entry free is possible, need check all */
-		if ((pg_va + sizeof(struct page)) <=
-				(vmem_back->virt_addr + page_size)) {
-			page = (struct page *) (vmem_back->phys + pg_va -
-				vmem_back->virt_addr);
-			return page;
-		}
-	}
-
-	/* Probably that page struct is split between real pages */
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
-
-#else
-
-struct page *realmode_pfn_to_page(unsigned long pfn)
-{
-	struct page *page = pfn_to_page(pfn);
-	return page;
-}
-EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
-
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
 #ifdef CONFIG_PPC_BOOK3S_64