- .. SPDX-License-Identifier: GPL-2.0
- .. iommu:
- =====================================
- IOMMU Userspace API
- =====================================
- IOMMU UAPI is used for virtualization cases where communications are
- needed between physical and virtual IOMMU drivers. For baremetal
- usage, the IOMMU is a system device which does not need to communicate
- with userspace directly.
- The primary use cases are guest Shared Virtual Address (SVA) and
- guest IO virtual address (IOVA), wherein the vIOMMU implementation
- relies on the physical IOMMU and for this reason requires interactions
- with the host driver.
- .. contents:: :local:
- Functionalities
- ===============
- Communication between user and kernel goes in both directions. The
- supported user-kernel APIs are as follows:
- 1. Bind/Unbind guest PASID (e.g. Intel VT-d)
- 2. Bind/Unbind guest PASID table (e.g. ARM SMMU)
- 3. Invalidate IOMMU caches upon guest requests
- 4. Report errors to the guest and serve page requests
- Requirements
- ============
- The IOMMU UAPIs are generic and extensible to meet the following
- requirements:
- 1. Emulated and para-virtualised vIOMMUs
- 2. Multiple vendors (Intel VT-d, ARM SMMU, etc.)
- 3. Extensions to the UAPI shall not break existing userspace
- Interfaces
- ==========
- Although the data structures defined in IOMMU UAPI are self-contained,
- there are no user API functions introduced. Instead, IOMMU UAPI is
- designed to work with existing user driver frameworks such as VFIO.
- Extension Rules & Precautions
- -----------------------------
- When IOMMU UAPI gets extended, the data structures can *only* be
- modified in two ways:
- 1. Adding new fields by re-purposing the padding[] field. No size change.
- 2. Adding new union members at the end. May increase the structure sizes.
- No new fields can be added *after* the variable sized union, because
- doing so would shift offsets and break backward compatibility. A new
- flag must be introduced whenever a change affects the structure using
- either method. The IOMMU driver processes the data based on flags,
- which ensures backward compatibility.
- The version field is reserved only for the unlikely event that the
- UAPI is upgraded in its entirety.
- It's *always* the caller's responsibility to indicate the size of the
- structure passed by setting argsz appropriately.
- At the same time, argsz is user-provided data and is not trusted. The
- argsz field allows the user app to indicate how much data it is
- providing; it's still the kernel's responsibility to validate whether
- it's correct and sufficient for the requested operation.
- Compatibility Checking
- ----------------------
- When IOMMU UAPI extension results in some structure size increase,
- IOMMU UAPI code shall handle the following cases:
- 1. User and kernel have an exact size match
- 2. An older user with older kernel header (smaller UAPI size) running on a
- newer kernel (larger UAPI size)
- 3. A newer user with newer kernel header (larger UAPI size) running
- on an older kernel.
- 4. A malicious/misbehaving user passing illegal/invalid size but within
- range. The data may contain garbage.
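The four cases above can be sketched in plain userspace C. This is only
an illustration under stated assumptions, not the kernel implementation:
``struct demo_uapi_v2``, ``DEMO_FLAG_EXT``, ``DEMO_MINSZ`` and
``demo_copy_in()`` are hypothetical names, and memcpy() stands in for
copy_from_user().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical UAPI structure, for illustration only. */
struct demo_uapi_v2 {
	uint32_t argsz;
	uint32_t flags;
#define DEMO_FLAG_EXT	(1u << 0)	/* hypothetical: 'ext' carries valid data */
	uint64_t base;			/* present since the first revision */
	uint64_t ext;			/* added later, gated by DEMO_FLAG_EXT */
};

/* Smallest size any caller may legitimately pass. */
#define DEMO_MINSZ	offsetof(struct demo_uapi_v2, ext)

/*
 * Copy caller data into a zeroed "kernel" copy of the structure.
 * 'user' stands in for a __user pointer whose first field is argsz.
 */
static int demo_copy_in(struct demo_uapi_v2 *dst, const void *user)
{
	uint32_t argsz;
	size_t len;

	memset(dst, 0, sizeof(*dst));
	memcpy(&argsz, user, sizeof(argsz));

	/* Case 4: too small to hold even the mandatory fields. */
	if (argsz < DEMO_MINSZ)
		return -EINVAL;

	/* Cases 1-3: never copy more than this "kernel" understands. */
	len = argsz < sizeof(*dst) ? argsz : sizeof(*dst);
	memcpy(dst, user, len);

	/* Case 3: a newer user may set flags this "kernel" does not know. */
	if (dst->flags & ~DEMO_FLAG_EXT)
		return -EINVAL;

	/* A flag that needs data the caller did not supply is invalid. */
	if ((dst->flags & DEMO_FLAG_EXT) && argsz < sizeof(*dst))
		return -EINVAL;

	return 0;
}
```

Zeroing the destination first means an older, shorter caller leaves the
newer fields at zero, so only the flags decide which fields are consumed.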
- Feature Checking
- ----------------
- While launching a guest with vIOMMU, it is strongly advised to check
- the compatibility upfront, as some subsequent errors happening during
- vIOMMU operation, such as cache invalidation failures, cannot be nicely
- escalated to the guest due to IOMMU specifications. This can lead to
- catastrophic failures for the users.
- User applications such as QEMU are expected to import kernel UAPI
- headers. Backward compatibility is supported per feature flag.
- For example, an older QEMU (with an older kernel header) can run on a
- newer kernel. A newer QEMU (with a newer kernel header) may refuse to
- initialize on an older kernel if that kernel does not support the new
- feature flags. Simply recompiling existing code with a newer kernel
- header should not be an issue, because only existing flags are used.
- The IOMMU vendor driver should report the following features to IOMMU
- UAPI consumers (e.g. via VFIO).
- 1. IOMMU_NESTING_FEAT_SYSWIDE_PASID
- 2. IOMMU_NESTING_FEAT_BIND_PGTBL
- 3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE
- 4. IOMMU_NESTING_FEAT_CACHE_INVLD
- 5. IOMMU_NESTING_FEAT_PAGE_REQUEST
- Taking VFIO as an example, upon request from VFIO userspace (e.g. QEMU),
- the VFIO kernel code shall query the IOMMU vendor driver for support of
- the above features. The query result can then be reported back to the
- userspace caller. Details can be found in
- Documentation/driver-api/vfio.rst.
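An upfront check of this kind could look roughly like the sketch below.
The ``DEMO_FEAT_*`` names and their bit positions are assumptions made
for this example only; the real values come from the kernel UAPI header,
and ``demo_missing_features()`` is a hypothetical helper.

```c
#include <stdint.h>

/* Hypothetical bit assignments, NOT the values from the kernel header. */
#define DEMO_FEAT_SYSWIDE_PASID		(1u << 0)
#define DEMO_FEAT_BIND_PGTBL		(1u << 1)
#define DEMO_FEAT_BIND_PASID_TABLE	(1u << 2)
#define DEMO_FEAT_CACHE_INVLD		(1u << 3)
#define DEMO_FEAT_PAGE_REQUEST		(1u << 4)

/* Return the required features the host did not report (0 if none). */
static uint32_t demo_missing_features(uint32_t reported, uint32_t required)
{
	return required & ~reported;
}
```

A user application would refuse to start the vIOMMU when the result is
non-zero, rather than discovering the gap during guest operation.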
- Data Passing Example with VFIO
- ------------------------------
- As the ubiquitous userspace driver framework, VFIO is already IOMMU
- aware and shares many key concepts such as device model, group, and
- protection domain. Other user driver frameworks can also be extended
- to support IOMMU UAPI, but that is outside the scope of this document.
- In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the
- IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel
- transport, capability checking, security, and life cycle management of
- process address space ID (PASID).
- The VFIO layer conveys the data structures down to the IOMMU driver. It
- follows the pattern below::
- struct {
- __u32 argsz;
- __u32 flags;
- __u8 data[];
- };
- Here data[] contains the IOMMU UAPI data structures. VFIO has the
- freedom to bundle the data as well as parse data size based on its own flags.
- In order to determine the size and feature set of the user data, argsz
- and flags (or the equivalent) are also embedded in the IOMMU UAPI data
- structures.
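As a rough userspace illustration of this nesting, the sketch below
places a payload that carries its own argsz inside a wrapper that
follows the argsz/flags/data[] pattern. ``struct demo_wrapper``,
``struct demo_payload`` and ``demo_wrap()`` are hypothetical names and
do not correspond to any VFIO or IOMMU UAPI definition.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Wrapper following the argsz/flags/data[] pattern (hypothetical name). */
struct demo_wrapper {
	uint32_t argsz;
	uint32_t flags;
	uint8_t  data[];
};

/* Hypothetical payload; like the IOMMU UAPI data, it starts with argsz. */
struct demo_payload {
	uint32_t argsz;
	uint32_t flags;
	uint64_t value;
};

/* Allocate a wrapper and embed the payload; the caller frees the result. */
static struct demo_wrapper *demo_wrap(const struct demo_payload *p,
				      uint32_t flags)
{
	size_t total = sizeof(struct demo_wrapper) + sizeof(*p);
	struct demo_wrapper *w = calloc(1, total);

	if (!w)
		return NULL;

	w->argsz = (uint32_t)total;	/* size of wrapper plus payload */
	w->flags = flags;		/* wrapper-level flags */
	memcpy(w->data, p, sizeof(*p));	/* payload keeps its own argsz */
	return w;
}
```

The outer argsz covers the whole buffer, while the inner argsz describes
only the embedded structure, so each layer can validate its own part.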
- A "__u32 argsz" field is *always* at the beginning of each structure.
- For example:
- ::
- struct iommu_cache_invalidate_info {
- __u32 argsz;
-
- __u32 version;
- /* IOMMU paging structure cache */
-
-
-
-
- __u8 cache;
- __u8 granularity;
- __u8 padding[6];
- union {
- struct iommu_inv_pasid_info pasid_info;
- struct iommu_inv_addr_info addr_info;
- } granu;
- };
- VFIO is responsible for checking its own argsz and flags. It then
- invokes the appropriate IOMMU UAPI functions. The user pointers are passed
- to the IOMMU layer for further processing. The responsibilities are
- divided as follows:
- - The generic IOMMU layer checks the argsz range based on the UAPI data
- in the current kernel version.
- - The generic IOMMU layer checks the content of the UAPI data for
- non-zero reserved bits in flags, non-zero padding fields, and
- unsupported versions. This ensures that userspace does not break in
- the future when these fields or flags are put to use.
- - The vendor IOMMU driver checks argsz based on vendor flags. UAPI data
- is consumed based on flags. The vendor driver has access to the
- unadulterated argsz value in case of vendor-specific future
- extensions. Currently, it does not perform the copy_from_user()
- itself. A __user pointer can be provided in some future scenarios
- where there's vendor data outside of the structure definition.
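The reserved-field check performed by the generic layer can be sketched
as follows. The header layout mirrors the leading fields of the example
structure above, but ``struct demo_inv_hdr``, ``DEMO_VERSION_1`` and
``DEMO_CACHE_KNOWN`` are assumptions made for this example, not kernel
definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header modeled on the leading fields of the example above. */
struct demo_inv_hdr {
	uint32_t argsz;
	uint32_t version;
	uint8_t  cache;
	uint8_t  granularity;
	uint8_t  padding[6];
};

#define DEMO_VERSION_1		1u	/* assumed: only supported version */
#define DEMO_CACHE_KNOWN	0x07u	/* assumed: currently defined cache bits */

/* Reject anything set in fields or bits that are reserved today. */
static int demo_check_reserved(const struct demo_inv_hdr *h)
{
	size_t i;

	if (h->version != DEMO_VERSION_1)
		return -EINVAL;

	/* Unknown cache type bits may gain a meaning later. */
	if (h->cache & ~DEMO_CACHE_KNOWN)
		return -EINVAL;

	/* Padding must be zero so it can be re-purposed for new fields. */
	for (i = 0; i < sizeof(h->padding); i++)
		if (h->padding[i])
			return -EINVAL;

	return 0;
}
```

Rejecting non-zero reserved data today is what allows those bits and
bytes to be assigned a meaning later without misreading older callers.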
- IOMMU code treats UAPI data in two categories:
- - structure contains vendor data
- (Example: iommu_uapi_cache_invalidate())
- - structure contains only generic data
- (Example: iommu_uapi_sva_bind_gpasid())
- Sharing UAPI with in-kernel users
- ---------------------------------
- For UAPIs that are shared with in-kernel users, a wrapper function is
- provided to distinguish the callers. For example,
- Userspace caller ::
- int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain,
- struct device *dev,
- void __user *udata);
- In-kernel caller ::
- int iommu_sva_unbind_gpasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t ioasid);