Commit Graph

27 Commits

Author SHA1 Message Date
Isaac J. Manjarres
ba575b2222 BACKPORT: FROMLIST: iommu/io-pgtable: Introduce map_pages() as a page table op
Mapping memory into io-pgtables follows the same semantics
that unmapping memory used to follow (i.e. a buffer will be
mapped one page block per call to the io-pgtable code). This
means that it can be optimized in the same way that unmapping
memory was, so add a map_pages() callback to the io-pgtable
ops structure, so that a range of pages of the same size
can be mapped within the same call.

Bug: 178537788
Change-Id: I5f2a86f21216f26b2cc2f70904c2968467c5363a
Link: https://lore.kernel.org/linux-iommu/20210408171402.12607-1-isaacm@codeaurora.org/T/#t
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Suggested-by: Will Deacon <will@kernel.org>
2021-04-09 21:09:05 -07:00
Isaac J. Manjarres
1e74a0fd95 FROMLIST: iommu/io-pgtable: Introduce unmap_pages() as a page table op
The io-pgtable code expects to operate on a single block or
granule of memory that is supported by the IOMMU hardware when
unmapping memory.

This means that when a large buffer that consists of multiple
such blocks is unmapped, the io-pgtable code will walk the page
tables to the correct level to unmap each block, even for blocks
that are virtually contiguous and at the same level, which can
incur an overhead in performance.

Introduce the unmap_pages() page table op to express to the
io-pgtable code that it should unmap a number of blocks of
the same size, instead of a single block. Doing so allows
multiple blocks to be unmapped in one call to the io-pgtable
code, reducing the number of page table walks, and indirect
calls.

Bug: 178537788
Link: https://lore.kernel.org/linux-iommu/20210408171402.12607-1-isaacm@codeaurora.org/T/#t
Change-Id: Ic3b3bf2f0e41eda4ca75482cd95a671a1f8e1cee
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
2021-04-09 21:09:05 -07:00
Yong Wu
04e8c94822 UPSTREAM: iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek
MediaTek extend the bit5 in lvl1 and lvl2 descriptor as PA34.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Link: https://lore.kernel.org/r/20210111111914.22211-11-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 40596d2f2b6075f6c33180b2f55c814ff4885475)

BUG=b:174513569

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Change-Id: I126769349c1eec5503338304981bdc1f4eae0787
2021-03-24 12:45:11 -07:00
Robin Murphy
e947ea97f7 UPSTREAM: iommu/io-pgtable: Remove TLBI_ON_MAP quirk
IO_PGTABLE_QUIRK_TLBI_ON_MAP is now fully superseded by the
core API's iotlb_sync_map callback.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/5abb80bba3a7c371d5ffb7e59c05586deddb9a91.1611764372.git.robin.murphy@arm.com
[will: Remove unused 'iop' local variable from arm_v7s_map()]
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 3d5eab41451f8e28f3e45eef8f6b372bf56612fb)

BUG=b:174513569

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Change-Id: I77af3446d897e3a0c998dc7c5aef19e6c41ef2ca
2021-03-24 12:45:10 -07:00
Yong Wu
f48d6310fe UPSTREAM: iommu/io-pgtable: Allow io_pgtable_tlb ops optional
This patch allows io_pgtable_tlb ops could be null since the IOMMU drivers
may use the tlb ops from iommu framework.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210107122909.16317-6-yong.wu@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 77e0992aee4e980e8c553e512a5dfa3e704cf030)

BUG=b:174513569

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Change-Id: Id0b401abf98ed11ef9f74ad874c1c1c7e7b070a8
2021-03-24 12:45:10 -07:00
Robin Murphy
389c1b3f17 UPSTREAM: iommu/io-pgtable: Remove tlb_flush_leaf
The only user of tlb_flush_leaf is a particularly hairy corner of the
Arm short-descriptor code, which wants a synchronous invalidation to
minimise the races inherent in trying to split a large page mapping.
This is already far enough into "here be dragons" territory that no
sensible caller should ever hit it, and thus it really doesn't need
optimising. Although using tlb_flush_walk there may technically be
more heavyweight than needed, it does the job and saves everyone else
having to carry around useless baggage.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/9844ab0c5cb3da8b2f89c6c2da16941910702b41.1606324115.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit fefe8527a1e0e0014946c6b5b3b2e40cb32bb5d3)

BUG=b:174513569

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Change-Id: I04128ee941e78b3aad53d7d227d8d825e9ee6fe6
2021-03-24 12:45:09 -07:00
Isaac J. Manjarres
f5333ac8be FROMLIST: iommu/io-pgtable: Introduce map_sg() as a page table op
While mapping a scatter-gather list, iommu_map_sg() calls
into the IOMMU driver through an indirect call, which can
call into the io-pgtable code through another indirect call.

This sequence of going through the IOMMU core code, the IOMMU
driver, and finally the io-pgtable code, occurs for every
element in the scatter-gather list, in the worse case, which
is not optimal.

Introduce a map_sg callback in the io-pgtable ops so that
IOMMU drivers can invoke it with the complete scatter-gather
list, so that it can be processed within the io-pgtable
code entirely, reducing the number of indirect calls, and
boosting overall iommu_map_sg() performance.

Bug: 176779203
Link: https://lore.kernel.org/linux-iommu/1610376862-927-1-git-send-email-isaacm@codeaurora.org/T/#t
Change-Id: I4b2088dd08eb97dcd94a6c6968082a3c4395351a
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
2021-01-12 14:55:16 +00:00
Tom Murphy
aae4c8e27b iommu: Rename iommu_tlb_* functions to iommu_iotlb_*
To keep naming consistent we should stick with *iotlb*. This patch
renames a few remaining functions.

Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Link: https://lore.kernel.org/r/20200817210051.13546-1-murphyt7@tcd.ie
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-09-04 11:16:09 +02:00
Baolin Wang
f34ce7a701 iommu: Add gfp parameter to io_pgtable_ops->map()
Now the ARM page tables are always allocated by GFP_ATOMIC parameter,
but the iommu_ops->map() function has been added a gfp_t parameter by
commit 781ca2de89 ("iommu: Add gfp parameter to iommu_ops::map"),
thus io_pgtable_ops->map() should use the gfp parameter passed from
iommu_ops->map() to allocate page pages, which can avoid wasting the
memory allocators atomic pools for some non-atomic contexts.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/3093df4cb95497aaf713fca623ce4ecebb197c2e.1591930156.git.baolin.wang@linux.alibaba.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-07-24 14:29:47 +02:00
Robin Murphy
db6903010a iommu/io-pgtable-arm: Prepare for TTBR1 usage
Now that we can correctly extract top-level indices without relying on
the remaining upper bits being zero, the only remaining impediments to
using a given table for TTBR1 are the address validation on map/unmap
and the awkward TCR translation granule format. Add a quirk so that we
can do the right thing at those points.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:25 +00:00
Will Deacon
ac4b80e5b9 iommu/io-pgtable-arm: Rationalise VTCR handling
Commit 05a648cd2dd7 ("iommu/io-pgtable-arm: Rationalise TCR handling")
reworked the way in which the TCR register value is returned from the
io-pgtable code when targetting the Arm long-descriptor format, in
preparation for allowing page-tables to target TTBR1.

As it turns out, the new interface is a lot nicer to use, so do the same
conversion for the VTCR register even though there is only a single base
register for stage-2 translation.

Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:25 +00:00
Robin Murphy
fb485eb18e iommu/io-pgtable-arm: Rationalise TCR handling
Although it's conceptually nice for the io_pgtable_cfg to provide a
standard VMSA TCR value, the reality is that no VMSA-compliant IOMMU
looks exactly like an Arm CPU, and they all have various other TCR
controls which io-pgtable can't be expected to understand. Thus since
there is an expectation that drivers will have to add to the given TCR
value anyway, let's strip it down to just the essentials that are
directly relevant to io-pgtable's inner workings - namely the various
sizes and the walk attributes.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Add missing include of bitfield.h]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:52:24 +00:00
Robin Murphy
d1e5f26f14 iommu/io-pgtable-arm: Rationalise TTBRn handling
TTBR1 values have so far been redundant since no users implement any
support for split address spaces. Crucially, though, one of the main
reasons for wanting to do so is to be able to manage each half entirely
independently, e.g. context-switching one set of mappings without
disturbing the other. Thus it seems unlikely that tying two tables
together in a single io_pgtable_cfg would ever be particularly desirable
or useful.

Streamline the configs to just a single conceptual TTBR value
representing the allocated table. This paves the way for future users to
support split address spaces by simply allocating a table and dealing
with the detailed TTBRn logistics themselves.

Tested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
[will: Drop change to ttbr value]
Signed-off-by: Will Deacon <will@kernel.org>
2020-01-10 15:39:23 +00:00
Robin Murphy
205577ab6f iommu/io-pgtable-arm: Rationalise MAIR handling
Between VMSAv8-64 and the various 32-bit formats, there is either one
64-bit MAIR or a pair of 32-bit MAIR0/MAIR1 or NMRR/PMRR registers.
As such, keeping two 64-bit values in io_pgtable_cfg has always been
overkill.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
2019-11-04 19:59:30 +00:00
Joerg Roedel
4c00889341 Merge branch 'arm/smmu' into arm/mediatek 2019-08-30 16:12:10 +02:00
Yong Wu
4c019de653 iommu/io-pgtable-arm-v7s: Extend to support PA[33:32] for MediaTek
MediaTek extend the arm v7s descriptor to support up to 34 bits PA where
the bit32 and bit33 are encoded in the bit9 and bit4 of the PTE
respectively. Meanwhile the iova still is 32bits.

Regarding whether the pagetable address could be over 4GB, the mt8183
support it while the previous mt8173 don't, thus keep it as is.

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
Yong Wu
73d50811bc iommu/io-pgtable-arm-v7s: Rename the quirk from MTK_4GB to MTK_EXT
In previous mt2712/mt8173, MediaTek extend the v7s to support 4GB dram.
But in the latest mt8183, We extend it to support the PA up to 34bit.
Then the "MTK_4GB" name is not so fit, This patch only change the quirk
name to "MTK_EXT".

Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-08-30 15:57:26 +02:00
Will Deacon
3951c41af4 iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->tlb_add_page()
With all the pieces in place, we can finally propagate the
iommu_iotlb_gather structure from the call to unmap() down to the IOMMU
drivers' implementation of ->tlb_add_page(). Currently everybody ignores
it, but the machinery is now there to defer invalidation.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:59 +01:00
Will Deacon
a2d3a382d6 iommu/io-pgtable: Pass struct iommu_iotlb_gather to ->unmap()
Update the io-pgtable ->unmap() function to take an iommu_iotlb_gather
pointer as an argument, and update the callers as appropriate.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:59 +01:00
Will Deacon
e953f7f2fa iommu/io-pgtable: Remove unused ->tlb_sync() callback
The ->tlb_sync() callback is no longer used, so it can be removed.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:58 +01:00
Will Deacon
abfd6fe0cd iommu/io-pgtable: Replace ->tlb_add_flush() with ->tlb_add_page()
The ->tlb_add_flush() callback in the io-pgtable API now looks a bit
silly:

  - It takes a size and a granule, which are always the same
  - It takes a 'bool leaf', which is always true
  - It only ever flushes a single page

With that in mind, replace it with an optional ->tlb_add_page() callback
that drops the useless parameters.

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:57 +01:00
Will Deacon
10b7a7d912 iommu/io-pgtable-arm: Call ->tlb_flush_walk() and ->tlb_flush_leaf()
Now that all IOMMU drivers using the io-pgtable API implement the
->tlb_flush_walk() and ->tlb_flush_leaf() callbacks, we can use them in
the io-pgtable code instead of ->tlb_add_flush() immediately followed by
->tlb_sync().

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:57 +01:00
Will Deacon
3445545b22 iommu/io-pgtable: Introduce tlb_flush_walk() and tlb_flush_leaf()
In preparation for deferring TLB flushes to iommu_tlb_sync(), introduce
two new synchronous invalidation helpers to the io-pgtable API, which
allow the unmap() code to force invalidation in cases where it cannot be
deferred (e.g. when replacing a table with a block or when TLBI_ON_MAP
is set).

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-29 17:22:55 +01:00
Will Deacon
298f78895b iommu/io-pgtable: Rename iommu_gather_ops to iommu_flush_ops
In preparation for TLB flush gathering in the IOMMU API, rename the
iommu_gather_ops structure in io-pgtable to iommu_flush_ops, which
better describes its purpose and avoids the potential for confusion
between different levels of the API.

$ find linux/ -type f -name '*.[ch]' | xargs sed -i 's/gather_ops/flush_ops/g'

Signed-off-by: Will Deacon <will@kernel.org>
2019-07-24 13:32:33 +01:00
Will Deacon
4f41845b34 iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag
IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just
an indication of whether or not the page-table walker for the IOMMU is
coherent with the CPU caches. Since cache coherency is more than just a
quirk, replace the flag with its own field in the io_pgtable_cfg
structure.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
2019-06-25 12:51:25 +01:00
Rob Herring
d08d42de64 iommu: io-pgtable: Add ARM Mali midgard MMU page table format
ARM Mali midgard GPU is similar to standard 64-bit stage 1 page tables, but
have a few differences. Add a new format type to represent the format. The
input address size is 48-bits and the output address size is 40-bits (and
possibly less?). Note that the later bifrost GPUs follow the standard
64-bit stage 1 format.

The differences in the format compared to 64-bit stage 1 format are:

The 3rd level page entry bits are 0x1 instead of 0x3 for page entries.

The access flags are not read-only and unprivileged, but read and write.
This is similar to stage 2 entries, but the memory attributes field matches
stage 1 being an index.

The nG bit is not set by the vendor driver. This one didn't seem to matter,
but we'll keep it aligned to the vendor driver.

Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190409205427.6943-2-robh@kernel.org
2019-04-12 12:52:38 -05:00
Rob Herring
b77cf11f09 iommu: Allow io-pgtable to be used outside of drivers/iommu/
Move io-pgtable.h to include/linux/ and export alloc_io_pgtable_ops
and free_io_pgtable_ops. This enables drivers outside drivers/iommu/ to
use the page table library. Specifically, some ARM Mali GPUs use the
ARM page table formats.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: iommu@lists.linux-foundation.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2019-02-11 11:26:48 +01:00