Pull IOMMU updates from Joerg Roedel:
"This time with bigger changes than usual:
- A new IOMMU driver for the ARM SMMUv3.
This IOMMU is pretty different from SMMUv1 and v2 in that it is
configured through in-memory structures and not through the MMIO
register region. The ARM SMMUv3 also supports IO demand paging for
PCI devices with PRI/PASID capabilities, but this is not
implemented in the driver yet.
- Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
- Introduction of default domains into the IOMMU core code.
The rationale behind this is to move functionalily out of the IOMMU
drivers to common code to get to a unified behavior between
different drivers. The patches here introduce a default domain for
iommu-groups (isolation groups).
A device will now always be attached to a domain, either the
default domain or another domain handled by the device driver. The
IOMMU drivers have to be modified to make use of that feature. So
long the AMD IOMMU driver is converted, with others to follow.
- Patches for the Intel VT-d drvier to fix DMAR faults that happen
when a kdump kernel boots.
When the kdump kernel boots it re-initializes the IOMMU hardware,
which destroys all mappings from the crashed kernel. As this
happens before the endpoint devices are re-initialized, any
in-flight DMA causes a DMAR fault. These faults cause PCI master
aborts, which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old kernel
and keep the old mappings in place until the device driver of the
new kernel takes over. This emulates the the behavior without an
IOMMU to the best degree possible.
- A couple of other small fixes and cleanups"
* tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (69 commits)
iommu/amd: Handle large pages correctly in free_pagetable
iommu/vt-d: Don't disable IR when it was previously enabled
iommu/vt-d: Make sure copied over IR entries are not reused
iommu/vt-d: Copy IR table from old kernel when in kdump mode
iommu/vt-d: Set IRTA in intel_setup_irq_remapping
iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
iommu/vt-d: Enable Translation only if it was previously disabled
iommu/vt-d: Don't disable translation prior to OS handover
iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed
iommu/vt-d: Don't do early domain assignment if kdump kernel
iommu/vt-d: Allocate si_domain in init_dmars()
iommu/vt-d: Mark copied context entries
iommu/vt-d: Do not re-use domain-ids from the old kernel
iommu/vt-d: Copy translation tables from old kernel
iommu/vt-d: Detect pre enabled translation
iommu/vt-d: Make root entry visible for hardware right after allocation
iommu/vt-d: Init QI before root entry is allocated
iommu/vt-d: Cleanup log messages
...
Keep it enabled in kdump kernel to guarantee interrupt
delivery.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Walk over the copied entries and mark the present ones as
allocated.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When we are booting into a kdump kernel and find IR enabled,
copy over the contents of the previous IR table so that
spurious interrupts will not be target aborted.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This way we can give the hardware the new IR table right
after it has been allocated and initialized.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Move it to this function for now, so that the copy routines
for irq remapping take no effect yet.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
QI needs to be enabled when we program the irq remapping
table to hardware in the prepare phase later.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We need this to be detected already when we program the irq
remapping table pointer to hardware.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Give them a common prefix that can be grepped for and
improve the wording here and there.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull intel iommu updates from David Woodhouse:
"This lays a little of the groundwork for upcoming Shared Virtual
Memory support — fixing some bogus #defines for capability bits and
adding the new ones, and starting to use the new wider page tables
where we can, in anticipation of actually filling in the new fields
therein.
It also allows graphics devices to be assigned to VM guests again.
This got broken in 3.17 by disallowing assignment of RMRR-afflicted
devices. Like USB, we do understand why there's an RMRR for graphics
devices — and unlike USB, it's actually sane. So we can make an
exception for graphics devices, just as we do USB controllers.
Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to
persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty
hack to allow broken BIOSes to forbid us from using X2APIC when they
do stupid and invasive things and would break if we did.
Someone noticed that since Windows doesn't have full IOMMU support for
DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid
initialising the IOMMU on the graphics unit altogether.
This means that it would be available for use in "driver mode", where
the IOMMU registers are made available through a BAR of the graphics
device and the graphics driver can do SVM all for itself.
So they started setting the X2APIC_OPT_OUT bit on *all* platforms with
SVM capabilities. And even the platforms which *might*, if the
planets had been aligned correctly, possibly have had SVM capability
but which in practice actually don't"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: support extended root and context entries
iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification
iommu/vt-d: Allow RMRR on graphics devices too
iommu/vt-d: Print x2apic opt out info instead of printing a warning
iommu/vt-d: kill bogus ecap_niotlb_iunits()
BIOS can set up x2apic_opt_out bit on some platforms, for various misguided
reasons like insane SMM code with weird assumptions about what descriptors
look like, or wanting Windows not to enable the IOMMU so that the graphics
driver will take it over for SVM in "driver mode".
A user can either disable the x2apic_opt_out bit in BIOS or by kernel
parameter "no_x2apic_optout". Instead of printing a warning, we just
print information of x2apic opt out.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Currently if CPU supports X2APIC, IR hardware must work in X2APIC mode
or disabled. Change the code to support IR working in XAPIC mode when
CPU supports X2APIC. Then the CPU APIC driver will decide how to handle
such as configuration by:
1) Disabling X2APIC mode
2) Forcing X2APIC physical mode
This change also fixes a live locking when
1) BIOS enables CPU X2APIC
2) DMAR table disables X2APIC mode or IR hardware doesn't support X2APIC
with following messages:
[ 37.863463] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2
[ 37.863463] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry
[ 37.879372] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2
[ 37.879372] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry
[ 37.895282] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2
[ 37.895282] INTR-REMAP:[fault reason 36] Detected reserved fields in the IRTE entry
[ 37.911192] dmar: INTR-REMAP: Request device [[f0:1f.7] fault index 2
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Tested-by: Joerg Roedel <joro@8bytes.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/1420615903-28253-11-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The whole iommu setup for irq remapping is a convoluted mess. The
iommu detect function gets called from mem_init() and the prepare
callback gets called from enable_IR_x2apic() for unknown reasons.
Of course AMD and Intel setup differs in nonsensical ways. Intels
prepare callback is explicit while AMDs prepare callback is implicit
in setup_irq_remapping_ops() just to be called in the prepare call
again.
Because all of this gets called from enable_IR_x2apic() and the dmar
prepare function merily parses the ACPI tables, but does not allocate
memory we end up with memory allocation from irq disabled context
later on.
AMDs iommu code at least allocates the required memory from the
prepare function. That has issues as well, but thats not scope of this
patch.
The goal of this change is to distangle the allocation from the actual
enablement. There is no point to allocate memory from irq disabled
regions with GFP_ATOMIC just because it does not matter at that point
in the boot stage. It matters with physical hotplug later on.
There is another issue with the current setup. Due to the conversion
to stacked irqdomains we end up with a call into the irqdomain
allocation code from irq disabled context, but that code does
GFP_KERNEL allocations rightfully as there is no reason to do
preperatory allocations with GFP_ATOMIC.
That change caused the allocator code to complain about GFP_KERNEL
allocations invoked in atomic context. Boris provided a temporary
hackaround which changed the GFP flags if irq_domain_add() got called
from atomic context. Not pretty and we really dont want to get this
into a mainline release for obvious reasons.
Move the ACPI table parsing and the resulting memory allocations from
the enable to the prepare function. That allows to get rid of the
horrible hackaround in irq_domain_add() later.
[Jiang] Rebased onto v3.19
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Borislav Petkov <bp@alien8.de>
Acked-and-tested-by: Joerg Roedel <joro@8bytes.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: iommu@lists.linux-foundation.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20141205084147.313026156@linutronix.de
Link: http://lkml.kernel.org/r/1420615903-28253-3-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Enhance error recovery in function intel_enable_irq_remapping()
by tearing down all created data structures.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement required callback functions for intel_irq_remapping driver
to support DMAR unit hotplug.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On Intel platforms, an IO Hub (PCI/PCIe host bridge) may contain DMAR
units, so we need to support DMAR hotplug when supporting PCI host
bridge hotplug on Intel platforms.
According to Section 8.8 "Remapping Hardware Unit Hot Plug" in "Intel
Virtualization Technology for Directed IO Architecture Specification
Rev 2.2", ACPI BIOS should implement ACPI _DSM method under the ACPI
object for the PCI host bridge to support DMAR hotplug.
This patch introduces interfaces to parse ACPI _DSM method for
DMAR unit hotplug. It also implements state machines for DMAR unit
hot-addition and hot-removal.
The PCI host bridge hotplug driver should call dmar_hotplug_hotplug()
before scanning PCI devices connected for hot-addition and after
destroying all PCI devices for hot-removal.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 71054d8841 ("x86, hpet: Introduce x86_msi_ops.setup_hpet_msi")
introduced x86_msi_ops.setup_hpet_msi to setup hpet MSI irq
when irq remapping enabled. This caused a regression of
hpet MSI irq remapping.
Original code flow before commit 71054d8841:
hpet_setup_msi_irq()
arch_setup_hpet_msi()
setup_hpet_msi_remapped()
remap_ops->setup_hpet_msi()
alloc_irte()
msi_compose_msg()
hpet_msi_write()
...
Current code flow after commit 71054d8841:
hpet_setup_msi_irq()
x86_msi.setup_hpet_msi()
setup_hpet_msi_remapped()
intel_setup_hpet_msi()
alloc_irte()
Currently, we only call alloc_irte() for hpet MSI, but
do not composed and wrote its msg...
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Don't store the SIRTP request bit in the register state. It will
otherwise become sticky and could request an Interrupt Remap Table
Pointer update on each command register write.
Found while starting to emulate IR in QEMU, not by observing problems on
real hardware.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
VT-d code currently makes use of pci_find_upstream_pcie_bridge() in
order to find the topology based alias of a device. This function has
a few problems. First, it doesn't check the entire alias path of the
device to the root bus, therefore if a PCIe device is masked upstream,
the wrong result is produced. Also, it's known to get confused and
give up when it crosses a bridge from a conventional PCI bus to a PCIe
bus that lacks a PCIe capability. The PCI-core provided DMA alias
support solves both of these problems and additionally adds support
for DMA function quirks allowing VT-d to work with devices like
Marvell and Ricoh with known broken requester IDs.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Introduce a global rwsem dmar_global_lock, which will be used to
protect DMAR related global data structures from DMAR/PCI/memory
device hotplug operations in process context.
DMA and interrupt remapping related data structures are read most,
and only change when memory/PCI/DMAR hotplug event happens.
So a global rwsem solution is adopted for balance between simplicity
and performance.
For interrupt remapping driver, function intel_irq_remapping_supported(),
dmar_table_init(), intel_enable_irq_remapping(), disable_irq_remapping(),
reenable_irq_remapping() and enable_drhd_fault_handling() etc
are called during booting, suspending and resuming with interrupt
disabled, so no need to take the global lock.
For interrupt remapping entry allocation, the locking model is:
down_read(&dmar_global_lock);
/* Find corresponding iommu */
iommu = map_hpet_to_ir(id);
if (iommu)
/*
* Allocate remapping entry and mark entry busy,
* the IOMMU won't be hot-removed until the
* allocated entry has been released.
*/
index = alloc_irte(iommu, irq, 1);
up_read(&dmar_global_lock);
For DMA remmaping driver, we only uses the dmar_global_lock rwsem to
protect functions which are only called in process context. For any
function which may be called in interrupt context, we will use RCU
to protect them in following patches.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Pull IOMMU Updates from Joerg Roedel:
"A few patches have been queued up for this merge window:
- improvements for the ARM-SMMU driver (IOMMU_EXEC support, IOMMU
group support)
- updates and fixes for the shmobile IOMMU driver
- various fixes to generic IOMMU code and the Intel IOMMU driver
- some cleanups in IOMMU drivers (dev_is_pci() usage)"
* tag 'iommu-updates-v3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (36 commits)
iommu/vt-d: Fix signedness bug in alloc_irte()
iommu/vt-d: free all resources if failed to initialize DMARs
iommu/vt-d, trivial: clean sparse warnings
iommu/vt-d: fix wrong return value of dmar_table_init()
iommu/vt-d: release invalidation queue when destroying IOMMU unit
iommu/vt-d: fix access after free issue in function free_dmar_iommu()
iommu/vt-d: keep shared resources when failed to initialize iommu devices
iommu/vt-d: fix invalid memory access when freeing DMAR irq
iommu/vt-d, trivial: simplify code with existing macros
iommu/vt-d, trivial: use defined macro instead of hardcoding
iommu/vt-d: mark internal functions as static
iommu/vt-d, trivial: clean up unused code
iommu/vt-d, trivial: check suitable flag in function detect_intel_iommu()
iommu/vt-d, trivial: print correct domain id of static identity domain
iommu/vt-d, trivial: refine support of 64bit guest address
iommu/vt-d: fix resource leakage on error recovery path in iommu_init_domains()
iommu/vt-d: fix a race window in allocating domain ID for virtual machines
iommu/vt-d: fix PCI device reference leakage on error recovery path
drm/msm: Fix link error with !MSM_IOMMU
iommu/vt-d: use dedicated bitmap to track remapping entry allocation status
...
"index" needs to be signed for the error handling to work. I deleted a
little bit of obsolete cruft related to "index" and "start_index" as
well.
Fixes: 360eb3c568 ('iommu/vt-d: use dedicated bitmap to track remapping entry allocation status')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Simplify vt-d related code with existing macros and introduce a new
macro for_each_active_drhd_unit() to enumerate all active DRHD unit.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Use defined macro instead of hardcoding in function set_ioapic_sid()
for readability.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>