Commit Graph

25 Commits

Author SHA1 Message Date
Faiyaz Mohammed
a9db11ad55 arm-smmu: Add clock and regulator vote in qcom io page table sync path
The call path below flushes the IOTLB without a clock and regulator vote,
which results in a NOC error. To avoid the unclocked access, add the
clock and regulator vote to the qcom io page table sync path.

 __arm_smmu_flush_iotlb_all[arm_smmu]+0x88
 arm_smmu_qcom_tlb_sync[arm_smmu]+0x1c
 arm_lpae_install_table[qcom_iommu_util]+0x60
 __arm_lpae_map[qcom_iommu_util]+0x290
 __arm_lpae_map[qcom_iommu_util]+0x7b0
 arm_lpae_map_sg[qcom_iommu_util][jt]+0x348
 _iopgtbl_map_sg[msm_kgsl]+0x8c
 kgsl_iopgtbl_map[msm_kgsl]+0xec

Change-Id: I65c7b0c2e707192b66f4f86e3eb1bd97a818f43e
Signed-off-by: Faiyaz Mohammed <quic_faiyazm@quicinc.com>
2022-03-03 10:50:10 +05:30
Harshdeep Dhatt
50e7777431 arm-smmu-qcom: Export QCOM io-pagetables for adreno smmu
Do the groundwork for kgsl to be able to use QCOM io-pagetables
to make maps/unmaps faster.

Change-Id: Ib1b484e1e0ba21aaf8e9c0cac1c100cc981a6825
Signed-off-by: Harshdeep Dhatt <quic_hdhatt@quicinc.com>
2022-01-26 11:08:25 -07:00
Patrick Daly
b1a73a49b0 arm-smmu: Follow break-before-make sequence
Move tlb maintenance calls for arm_smmu_map() to occur prior to updating
affected PTEs, as required by ARM's break-before-make sequence.

After:
    size        iommu_map_sg      iommu_unmap
      4K            0.269 us         1.735 us
     64K            0.622 us         1.735 us
      1M            3.731 us         1.760 us
      2M            8.081 us         1.756 us
     12M           36.803 us         2.304 us
     24M           70.658 us         2.774 us
     32M           92.346 us         3.046 us

Before:
    size        iommu_map_sg      iommu_unmap
      4K            0.304 us         1.712 us
     64K            0.467 us         1.725 us
      1M            3.798 us         1.745 us
      2M            6.476 us         1.713 us
     12M           39.031 us         2.157 us
     24M           71.847 us         2.455 us
     32M           93.852 us         2.697 us

Change-Id: Ie67e340ca5e598fa283a4a1a869dc38cc8565d96
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
2021-10-08 17:16:09 -07:00
Patrick Daly
bd5e7787e1 arm-smmu-trace: Add trace event for creating L2 pagetables
Trace IPA address of L2 pagetables when they are created or
destroyed.

Change-Id: Ia329893f4ac686fff124c883cec9307119095184
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
2021-09-24 13:21:14 -07:00
Patrick Daly
a5ba308b8c qcom-io-pgtable-arm: Add support for IOMMU_CACHE_IWBRWA_OWBRA
GPU performance improvements have been observed when using io-coherent
no-write-allocate memory instead of normal io-coherent memory.

Change-Id: Ie0a82c83ef66030145b6a86dddb19817d2f35422
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
2021-07-21 11:18:40 -07:00
Ivaylo Georgiev
2e1928fcac Merge keystone/android12-5.10-keystone-qcom-release.43+ (023540d) into msm-5.10
* refs/heads/tmp-023540d:
  ANDROID: media: v4l2-core: extend the v4l2 subdev ioctl to support request
  ANDROID: logbuf: Remove if directive for vendor hooks
  ANDROID: iommu/io-pgtable-arm: Add IOMMU_CACHE_ICACHE_OCACHE_NWA
  FROMGIT: mac80211_hwsim: add concurrent channels scanning support over virtio
  ANDROID: GKI: update allowed symbols for exynosauto soc
  ANDROID: GKI: initial upload list for exynosauto soc
  ANDROID: logbuf: Add new logbuf vendor hook to support pr_cont()
  ANDROID: lib: Export show_mem() for vendor module usage

Change-Id: I80d229e406ef5e8ee8de7cbb058eeabfe517dca8
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
2021-07-06 01:39:38 -07:00
Isaac J. Manjarres
a821d98050 qcom-io-pgtable-arm: Free underlying page tables for larger mappings
Consider the case where a 2N MB buffer, where N > 1, is composed
entirely of 4 KB pages. This means that at the second-to-last level,
the buffer will have N non-leaf entries that point to page tables
with 4 KB mappings.

When the buffer is unmapped, all N entries will be cleared at the
second to last level. However, the existing logic only checks if
it needs to free the underlying page tables for the first non-leaf
entry. Therefore, the page table memory for the other N-1
entries will be leaked.

Fix this memory leak by ensuring that we apply the same check to
all N entries that are being unmapped.

Fixes: c87ee21c03 ("qcom-io-pgtable-arm: Implement the unmap_pages() callback")
Change-Id: Ieb2ff3dece20739fe42d5dbef507e25bb25d428b
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-04-29 15:54:50 -07:00
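
The fix described in a821d98050 above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_entry`, `toy_unmap_tables()` and `toy_demo_unmap()` are hypothetical names, and the real driver's check for whether a table must be freed is not shown in the commit message. The point is simply that the free check runs for every one of the N non-leaf entries, not only the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PTES_PER_TABLE 512  /* 4 KB table of 8-byte descriptors */
#define TOY_MAX_ENTRIES    8

/* Toy non-leaf entry: points to a last-level table, or NULL if empty. */
struct toy_entry {
    unsigned long *table;
};

/*
 * Hypothetical helper: clear `count` consecutive non-leaf entries starting
 * at `first`, applying the "free the underlying table" check to each entry.
 * Returns the number of last-level tables that were freed.
 */
static size_t toy_unmap_tables(struct toy_entry *entries, size_t first,
                               size_t count)
{
    size_t freed = 0;

    for (size_t i = first; i < first + count; i++) {
        if (entries[i].table) {
            free(entries[i].table);
            entries[i].table = NULL;
            freed++;
        }
    }
    return freed;
}

/* Build n populated entries, unmap them all, and report how many were freed. */
static size_t toy_demo_unmap(size_t n)
{
    struct toy_entry entries[TOY_MAX_ENTRIES] = { { NULL } };

    assert(n <= TOY_MAX_ENTRIES);
    for (size_t i = 0; i < n; i++) {
        entries[i].table = calloc(TOY_PTES_PER_TABLE, sizeof(unsigned long));
        assert(entries[i].table);
    }
    return toy_unmap_tables(entries, 0, n);
}
```

In this model, a version that only inspected `entries[first]` would free one table and leak the remaining N-1, which is the leak the commit message describes.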
Isaac J. Manjarres
8c48ddb4e4 qcom-io-pgtable-arm: Fix PTE calculation for unmapping last level entries
When unmapping last level entries, the calculation for the number of
entries to unmap erroneously uses the second to last level, instead of
the last level.

Correctly calculate the number of entries to unmap using the correct
level.

Fixes: c87ee21c03 ("qcom-io-pgtable-arm: Implement the unmap_pages() callback")
Change-Id: I841b84e79493ce07132bdf6286bad6e6cc26bf90
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-04-29 15:23:59 -07:00
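
To see why the level matters in 8c48ddb4e4 above, here is a minimal sketch assuming a 4 KB granule with four levels (0 to 3), 9 bits resolved per level and 8-byte descriptors. The `toy_` names are hypothetical and the driver's actual calculation is not reproduced here; the sketch only shows how the entry count changes with the level used.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVELS     4   /* levels 0..3, level 3 is the last level */
#define TOY_BITS_PER_LEVEL 9   /* 512 entries per table with a 4 KB granule */
#define TOY_PTE_SHIFT      3   /* log2 of an 8-byte descriptor */

/* log2 of the address range covered by one entry at the given level. */
static unsigned toy_level_shift(unsigned level)
{
    assert(level < TOY_MAX_LEVELS);
    return (TOY_MAX_LEVELS - level) * TOY_BITS_PER_LEVEL + TOY_PTE_SHIFT;
}

/* Number of entries at `level` needed to cover `size` bytes. */
static size_t toy_num_entries(size_t size, unsigned level)
{
    return size >> toy_level_shift(level);
}
```

Under these assumptions one last-level entry covers 4 KB and one second-to-last-level entry covers 2 MB, so a 64 KB unmap needs 16 last-level entries, while the same calculation at the second-to-last level yields 0.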
Isaac J. Manjarres
ddacdb4da6 qcom-io-pgtable-arm: Implement map_pages() callback
Implement the map_pages() callback for the qcom-io-pgtable-arm
format code.

Change-Id: I814b9fa50c2a59e350a44c7d681a0a173126a444
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-04-22 13:25:42 -07:00
Isaac J. Manjarres
c87ee21c03 qcom-io-pgtable-arm: Implement the unmap_pages() callback
Implement the unmap_pages() for the qcom-io-pgtable-arm format
code.

Change-Id: I2e94b25c736cdc5ee1346941f986c5ddeca30ba4
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-04-22 13:25:18 -07:00
Isaac J. Manjarres
05dc00c7ec qcom-io-pgtable-arm: Prepare PTE methods for handling multiple entries
The PTE methods currently operate on a single entry. In preparation
for manipulating multiple PTEs in one map or unmap call, allow them
to handle multiple PTEs.

Change-Id: I32a93b7bba77aa805674adc1424f15472f378c26
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-04-22 13:24:58 -07:00
Patrick Daly
03b53d05e5 arm-smmu: Fix missing TLB maintenance
The commit named in the Fixes tag below deferred TLB maintenance until
iommu_iotlb_sync(). This allows the following race:

Thread 1:
iommu_map IOVA RANGE [0x1000, 0x2000)
iommu_unmap IOVA RANGE [0x1000, 0x2000)
-> acquire pagetable lock
-> set PTE to 0. Since this is the last PTE in the PMD, the PMD will
be freed
-> release pagetable lock
-> Schedule out

Thread 2:
iommu_map IOVA RANGE [0x2000, 0x3000)
-> acquire pagetable lock
-> allocate & install new PMD, PTE
-> release pagetable lock
-> return

Oops! The hardware may now access the new PMD before TLB maintenance has
completed on the old PMD.

Thread 1:
-> iommu_iotlb_sync()
-> return

Fix this by implementing the iotlb_sync_map callback. This causes a
noticeable performance regression in iommu_map for the 4 KB size. Larger
sizes are less affected because map_sync() is called once regardless
of size.

Before:
    size           iommu_map      iommu_unmap
      4K            0.252 us         1.660 us
     64K            0.969 us         2.230 us
      1M           12.474 us        11.242 us

After:
    size           iommu_map      iommu_unmap
      4K            0.284 us         1.698 us +11%
     64K            1.027 us         2.350 us  +6%
      1M           12.533 us        10.993 us

Fixes: c38ad44557 ("iommu/arm-smmu: Defer TLB maintenance until after a buffer is unmapped")
Change-Id: I084a8844321c8e9240cedf9015a307111b27e210
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
2021-04-19 15:50:04 -07:00
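
The interleaving described in 03b53d05e5 above can be replayed in a single-threaded toy model. This is a sketch only: `toy_domain` and the `toy_` functions are hypothetical, and a single flag stands in for "stale TLB entries may exist". What the real iotlb_sync_map callback invalidates is not stated in the commit message, so the model simply treats it as a full sync.

```c
#include <stdbool.h>

struct toy_domain {
    bool tlb_stale;  /* an unmapped table may still be cached by the TLB */
};

/* Unmap clears the PTE and frees the PMD, but defers the invalidation. */
static void toy_unmap(struct toy_domain *d)
{
    d->tlb_stale = true;
}

/* Stand-in for iommu_iotlb_sync(): completes the deferred invalidation. */
static void toy_iotlb_sync(struct toy_domain *d)
{
    d->tlb_stale = false;
}

/*
 * Map installs a new PMD/PTE. Returns true if it returns to the caller
 * while stale TLB entries may still exist, which is the hazard.
 */
static bool toy_map(struct toy_domain *d, bool have_sync_map)
{
    if (have_sync_map)
        toy_iotlb_sync(d);  /* stand-in for the iotlb_sync_map callback */
    return d->tlb_stale;
}

/* Replays the sequence from the commit message in program order. */
static bool toy_race(bool have_sync_map)
{
    struct toy_domain d = { false };
    bool hazard;

    toy_unmap(&d);                        /* thread 1, then scheduled out */
    hazard = toy_map(&d, have_sync_map);  /* thread 2 maps and returns */
    toy_iotlb_sync(&d);                   /* thread 1 resumes */
    return hazard;
}
```

In this model `toy_race(false)` reports the hazard and `toy_race(true)` does not, at the cost of one extra sync per map call, which matches the small regression measured for 4K mappings.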
Patrick Daly
5c18eeddd6 arm-smmu: Fix scheduling while atomic
Move the lock protecting the page table contents into
qcom-io-pgtable-arm.c. This allows the pagetable code to selectively
drop the lock, allocate memory with the passed-in gfp flags, and
reacquire the lock.

Override the gfp_flags passed by iommu_map according to what the
client specifies in DOMAIN_ATTR_ATOMIC. This works around an issue
where the upstream dma-api always gives GFP_ATOMIC to iommu_map().
The dma-api does this because it does not have a way of knowing whether
the flags should be GFP_KERNEL or GFP_ATOMIC, and so it picks the more
conservative option.

Finally, remove the pagetable memory preallocation hack. Since GFP_KERNEL
is being used for page-table memory allocation, there is a much reduced
chance of temporary failure due to gfp_flags preventing swap/reclaim.

Change-Id: Ic369b12f77444ed3663eef794b671ce69b25fe8f
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
2021-04-02 15:16:21 -07:00
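
The "drop the lock, allocate, reacquire" pattern from 5c18eeddd6 above might look roughly like the following userspace sketch. Everything here is an assumption for illustration: a boolean stands in for the page-table lock, `calloc()` stands in for a GFP_KERNEL allocation that may sleep, and the recheck after reacquiring the lock is the usual companion to this pattern rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_pgtable {
    bool locked;          /* stand-in for the page-table lock */
    unsigned long *next;  /* next-level table, NULL until installed */
};

static void toy_lock(struct toy_pgtable *p)
{
    assert(!p->locked);
    p->locked = true;
}

static void toy_unlock(struct toy_pgtable *p)
{
    assert(p->locked);
    p->locked = false;
}

/* Called with the lock held; returns with the lock held. */
static unsigned long *toy_get_next_table(struct toy_pgtable *p)
{
    unsigned long *fresh;

    assert(p->locked);
    if (p->next)
        return p->next;

    toy_unlock(p);                        /* the allocation may sleep */
    fresh = calloc(512, sizeof(*fresh));  /* stand-in for GFP_KERNEL */
    toy_lock(p);

    if (p->next)
        free(fresh);      /* another mapper installed a table meanwhile */
    else
        p->next = fresh;  /* may be NULL if the allocation failed */
    return p->next;
}

/* Returns true if a table was installed and the lock is held afterwards. */
static bool toy_demo_alloc(void)
{
    struct toy_pgtable p = { false, NULL };
    bool ok;

    toy_lock(&p);
    ok = toy_get_next_table(&p) != NULL && p.locked;
    toy_unlock(&p);
    free(p.next);
    return ok;
}
```

Keeping the lock inside the page-table code is what makes this possible: only that code knows the exact point at which memory is needed and the lock can be released safely.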
Ivaylo Georgiev
86e90c034c Merge android12-5.10.21+ (44f812e) into msm-5.10
* refs/heads/tmp-44f812e:
  ANDROID: sched/core: Move en/dequeue hooks before related callbacks
  FROMGIT: kasan: record task_work_add() call stack
  FROMGIT: kasan, mm: integrate slab init_on_free with HW_TAGS
  FROMGIT: kasan, mm: integrate slab init_on_alloc with HW_TAGS
  FROMGIT: kasan, mm: integrate page_alloc init with HW_TAGS
  FROMGIT: mm: introduce debug_pagealloc_{map,unmap}_pages() helpers
  FROMGIT: mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO
  FROMGIT: mm/page_alloc: clear all pages in post_alloc_hook() with init_on_alloc=1
  FROMGIT: mm, page_poison: remove CONFIG_PAGE_POISONING_NO_SANITY
  FROMGIT: kernel/power: allow hibernation with page_poison sanity checking
  FROMGIT: mm, page_poison: use static key more efficiently
  BACKPORT: mm, page_alloc: do not rely on the order of page_poison and init_on_alloc/free parameters
  FROMGIT: kasan: init memory in kasan_(un)poison for HW_TAGS
  FROMGIT: arm64: kasan: allow to init memory when setting tags
  FROMGIT: mm, kasan: don't poison boot memory with tag-based modes
  FROMGIT: kasan: initialize shadow to TAG_INVALID for SW_TAGS
  FROMGIT: mm/kasan: switch from strlcpy to strscpy
  BACKPORT: kasan: remove redundant config option
  FROMGIT: kasan: fix per-page tags for non-page_alloc pages
  FROMGIT: kasan: fix KASAN_STACK dependency for HW_TAGS
  FROMGIT: kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
  FROMGIT: arm64: kasan: fix page_alloc tagging with DEBUG_VIRTUAL
  FROMLIST: configfs: make directories inherit uid/gid from creator
  ANDROID: GKI: add some padding to some driver core structures
  ANDROID: Initial Android 12 OWNERS for abi metafiles
  UPSTREAM: iommu/msm: Hook up iotlb_sync_map
  UPSTREAM: memory: mtk-smi: Allow building as module
  UPSTREAM: memory: mtk-smi: Use platform_register_drivers
  UPSTREAM: iommu/mediatek: Fix error code in probe()
  UPSTREAM: iommu/mediatek: Fix unsigned domid comparison with less than zero
  UPSTREAM: iommu/mediatek: Add mt8192 support
  UPSTREAM: memory: mtk-smi: Add mt8192 support
  UPSTREAM: iommu/mediatek: Remove unnecessary check in attach_device
  UPSTREAM: iommu/mediatek: Support master use iova over 32bit
  UPSTREAM: iommu/mediatek: Add iova reserved function
  UPSTREAM: iommu/mediatek: Support for multi domains
  UPSTREAM: iommu/mediatek: Add get_domain_id from dev->dma_range_map
  UPSTREAM: iommu/mediatek: Add iova_region structure
  UPSTREAM: iommu/mediatek: Move geometry.aperture updating into domain_finalise
  UPSTREAM: iommu/mediatek: Move domain_finalise into attach_device
  UPSTREAM: iommu/mediatek: Adjust the structure
  UPSTREAM: iommu/mediatek: Support report iova 34bit translation fault in ISR
  UPSTREAM: iommu/mediatek: Support up to 34bit iova in tlb flush
  UPSTREAM: iommu/mediatek: Add power-domain operation
  UPSTREAM: iommu/mediatek: Add pm runtime callback
  UPSTREAM: iommu/mediatek: Add device link for smi-common and m4u
  UPSTREAM: iommu/mediatek: Add error handle for mtk_iommu_probe
  UPSTREAM: iommu/mediatek: Move hw_init into attach_device
  UPSTREAM: iommu/mediatek: Update oas for v7s
  UPSTREAM: iommu/mediatek: Add a flag for iova 34bits case
  UPSTREAM: iommu/io-pgtable-arm-v7s: Quad lvl1 pgtable for MediaTek
  UPSTREAM: iommu/io-pgtable-arm-v7s: Add cfg as a param in some macros
  UPSTREAM: iommu/io-pgtable-arm-v7s: Clarify LVL_SHIFT/BITS macro
  UPSTREAM: iommu/io-pgtable-arm-v7s: Use ias to check the valid iova in unmap
  UPSTREAM: iommu/io-pgtable-arm-v7s: Extend PA34 for MediaTek
  UPSTREAM: iommu/mediatek: Use the common mtk-memory-port.h
  UPSTREAM: dt-bindings: mediatek: Add binding for mt8192 IOMMU
  UPSTREAM: dt-bindings: memory: mediatek: Rename header guard for SMI header file
  UPSTREAM: dt-bindings: memory: mediatek: Extend LARB_NR_MAX to 32
  UPSTREAM: dt-bindings: memory: mediatek: Add a common memory header file
  UPSTREAM: dt-bindings: memory: mediatek: Convert SMI to DT schema
  UPSTREAM: dt-bindings: iommu: mediatek: Convert IOMMU to DT schema
  UPSTREAM: iommu/mediatek: Remove the tlb-ops for v7s
  UPSTREAM: iommu/io-pgtable: Remove TLBI_ON_MAP quirk
  UPSTREAM: iommu/io-pgtable: Allow io_pgtable_tlb ops optional
  UPSTREAM: iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  UPSTREAM: iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  BACKPORT: UPSTREAM: iommu: Add iova and size as parameters in iotlb_sync_map
  UPSTREAM: iommu/io-pgtable: Remove tlb_flush_leaf
  ANDROID: abi_gki_aarch64_qcom: Add symbols to allow list
  ANDROID: Add vendor hook to binder.
  ANDROID: fs: Add vendor hooks for ep_create_wakeup_source & timerfd_create
  Revert "FROMLIST: fs/buffer.c: Revoke LRU when trying to drop buffers"
  ANDROID: enable LLVM_IAS=1 for clang's integrated assembler for arm
  FROMLIST: ARM: kprobes: rewrite test-arm.c in UAL
  FROMLIST: ARM: kprobes: fix UNPREDICTABLE warnings
  UPSTREAM: ARM: efistub: replace adrl pseudo-op with adr_l macro invocation
  UPSTREAM: ARM: assembler: introduce adr_l, ldr_l and str_l macros
  UPSTREAM: ARM: 9029/1: Make iwmmxt.S support Clang's integrated assembler
  FROMGIT: binder: BINDER_GET_FROZEN_INFO ioctl
  FROMGIT: binder: use EINTR for interrupted wait for work
  BACKPORT: FROMGIT: binder: BINDER_FREEZE ioctl
  ANDROID: qcom: Add pci_dev_present to ABI
  ANDROID: GKI: Add sysfs_emit to symbol list
  ANDROID: gki_defconfig: Enable IFB, NET_SCH_TBF, NET_ACT_POLICE
  ANDROID: gki_defconfig: Enable USB_NET_CDC_NCM
  ANDROID: gki_defconfig: Enable USB_NET_AQC111
  UPSTREAM: usb: dwc3: gadget: Use max speed if unspecified
  UPSTREAM: usb: dwc3: gadget: Set gadget_max_speed when set ssp_rate
  ANDROID: freezer: export the freezer_cgrp_subsys for GKI purpose.
  UPSTREAM: usb: dwc3: qcom: skip interconnect init for ACPI probe
  FROMGIT: usb: dwc3: gadget: Ignore EP queue requests during bus reset
  FROMGIT: usb: dwc3: gadget: Avoid continuing preparing TRBs during teardown
  ANDROID: gpiolib: Add vendor hook for gpio read
  ANDROID: abi_gki_aarch64_qcom: Whitelist sched_setattr
  ANDROID: GKI: sched: add Android ABI padding to some structures
  ANDROID: GKI: mm: add Android ABI padding to some structures
  ANDROID: GKI: mount.h: add Android ABI padding to some structures
  FROMLIST: mm: fs: Invalidate BH LRU during page migration
  FROMLIST: mm: replace migrate_[prep|finish] with lru_cache_[disable|enable]
  BACKPORT: FROMLIST: mm: disable LRU pagevec during the migration temporarily
  Revert "FROMLIST: mm: replace migrate_prep with lru_add_drain_all"
  Revert "BACKPORT: FROMLIST: mm: disable LRU pagevec during the migration temporarily"
  Revert "FROMLIST: mm: fs: Invalidate BH LRU during page migration"
  ANDROID: vendor_hooks: Add hooks for account process tick
  ANDROID: usb: dwc3: gadget: Export dwc3_stop_active_transfer, dwc3_send_gadget_ep_cmd
  ANDROID: clang: update to 12.0.4
  ANDROID: vendor_hooks: Add hooks for improving binder trans
  ANDROID: GKI: Disable DTPM CPU device
  UPSTREAM: powercap/drivers/dtpm: Add the experimental label to the option description
  UPSTREAM: powercap/drivers/dtpm: Fix root node initialization
  ANDROID: GKI: sched.h: add Android ABI padding to some structures
  ANDROID: GKI: module.h: add Android ABI padding to some structures
  ANDROID: GKI: sock.h: add Android ABI padding to some structures
  ANDROID: sched/fair: Do not sync task util with SD_BALANCE_FORK
  FROMGIT: selinux: vsock: Set SID for socket returned by accept()
  ANDROID: usb: typec: tcpci: Migrate restricted vendor hook
  ANDROID: qcom: Add is_dma_buf_file to ABI
  ANDROID: GKI: update .xml file
  ANDROID: GKI: enable KFENCE by setting the sample interval to 500ms
  ANDROID: abi_gki_aarch64_qcom: Add xhci symbols to list
  ANDROID: vmlinux.lds.h: Define SANITIZER_DISCARDS with CONFIG_CFI_CLANG
  ANDROID: usb: typec: tcpci: Add vendor hook to mask vbus present
  ANDROID: usb: typce: tcpci: Add vendor hook for chip specific features
  ANDROID: usb: typec: tcpci: Add vendor hooks for tcpci interface
  FROMGIT: f2fs: add sysfs nodes to get runtime compression stat
  ANDROID: dma-buf: Fix error path on system heaps use of the page pool
  ANDROID: usb: typec: tcpm: Fix event storm caused by error in backport
  ANDROID: GKI: USB: XHCI: add Android ABI padding to lots of xhci structures
  FROMGIT: KVM: arm64: Fix host's ZCR_EL2 restore on nVHE
  FROMGIT: KVM: arm64: Force SCTLR_EL2.WXN when running nVHE
  FROMGIT: KVM: arm64: Turn SCTLR_ELx_FLAGS into INIT_SCTLR_EL2_MMU_ON
  FROMGIT: KVM: arm64: Use INIT_SCTLR_EL2_MMU_OFF to disable the MMU on KVM teardown
  FROMGIT: arm64: Use INIT_SCTLR_EL1_MMU_OFF to disable the MMU on CPU restart
  FROMGIT: KVM: arm64: Enable SVE support for nVHE
  FROMGIT: KVM: arm64: Save/restore SVE state for nVHE
  BACKPORT: FROMGIT: KVM: arm64: Trap host SVE accesses when the FPSIMD state is dirty
  FROMGIT: KVM: arm64: Save guest's ZCR_EL1 before saving the FPSIMD state
  FROMGIT: KVM: arm64: Map SVE context at EL2 when available
  BACKPORT: FROMGIT: KVM: arm64: Rework SVE host-save/guest-restore
  FROMGIT: arm64: sve: Provide a conditional update accessor for ZCR_ELx
  FROMGIT: KVM: arm64: Introduce vcpu_sve_vq() helper
  FROMGIT: KVM: arm64: Let vcpu_sve_pffr() handle HYP VAs
  FROMGIT: KVM: arm64: Use {read,write}_sysreg_el1 to access ZCR_EL1
  FROMGIT: KVM: arm64: Provide KVM's own save/restore SVE primitives
  ANDROID: GKI: USB: Gadget: add Android ABI padding to struct usb_gadget
  ANDROID: vendor_hooks: Add hooks for memory when debug
  ANDROID: vendor_hooks: Add hooks for ufs scheduler
  ANDROID: GKI: sound/usb/card.h: add Android ABI padding to struct snd_usb_endpoint
  ANDROID: GKI: user_namespace.h: add Android ABI padding to a structure
  ANDROID: GKI: timer.h: add Android ABI padding to a structure
  ANDROID: GKI: quota.h: add Android ABI padding to some structures
  ANDROID: GKI: mmu_notifier.h: add Android ABI padding to some structures
  ANDROID: GKI: mm.h: add Android ABI padding to a structure
  ANDROID: GKI: kobject.h: add Android ABI padding to some structures
  ANDROID: GKI: kernfs.h: add Android ABI padding to some structures
  ANDROID: GKI: irqdomain.h: add Android ABI padding to a structure
  ANDROID: GKI: ioport.h: add Android ABI padding to a structure
  ANDROID: GKI: iomap.h: add Android ABI padding to a structure
  ANDROID: GKI: hrtimer.h: add Android ABI padding to a structure
  ANDROID: GKI: genhd.h: add Android ABI padding to some structures
  ANDROID: GKI: ethtool.h: add Android ABI padding to a structure
  ANDROID: GKI: dma-mapping.h: add Android ABI padding to a structure
  ANDROID: GKI: networking: add Android ABI padding to a lot of networking structures
  ANDROID: GKI: blk_types.h: add Android ABI padding to a structure
  ANDROID: GKI: scsi.h: add Android ABI padding to a structure
  ANDROID: GKI: pci: add Android ABI padding to some structures
  ANDROID: GKI: add Android ABI padding to struct nf_conn

Conflicts:
	Documentation/devicetree/bindings
	include/linux/usb/gadget.h

Change-Id: Id08dc5a5299b4a780553a44a402d18e9b5b096cb
Signed-off-by: Ivaylo Georgiev <irgeorgiev@codeaurora.org>
2021-03-25 04:30:56 -07:00
Stepan Moskovchenko
7c248e8a8a qcom-io-pgtable-arm: Optimize map by batching flushes
Currently, the page table is flushed after the installation of each
individual page table entry.  This is not terribly efficient since
virtual address ranges are often mapped with physically contiguous
chunks of page table memory.  Optimize the map operation by factoring
out the page table flushing so that contiguous ranges of page table
memory can be flushed in one go.

Change-Id: Ie80eb57ef50d253db6489a6f75824d4c746314c7
Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Signed-off-by: Mitchel Humpherys <mitchelh@codeaurora.org>
Signed-off-by: Neeti Desai <neetid@codeaurora.org>
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-02-17 21:11:26 -08:00
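
The batching idea in 7c248e8a8a above can be sketched by counting flush operations. This is a hypothetical model, not the driver's code: `slots` holds the indices of the page table entries written during one map call, in ascending order, and each run of consecutive indices is assumed to be flushable in a single operation.

```c
#include <stddef.h>

/*
 * Count the flush operations needed when every run of consecutive
 * page-table slots is flushed once instead of once per entry.
 * `slots` must be sorted in ascending order.
 */
static size_t toy_batched_flushes(const size_t *slots, size_t n)
{
    size_t flushes = 0;

    for (size_t i = 0; i < n; i++) {
        /* A new run starts at the first slot or after a gap. */
        if (i == 0 || slots[i] != slots[i - 1] + 1)
            flushes++;
    }
    return flushes;
}

/* Seven entries in three contiguous runs: {0..3}, {10,11}, {40}. */
static size_t toy_demo_flushes(void)
{
    static const size_t slots[] = { 0, 1, 2, 3, 10, 11, 40 };

    return toy_batched_flushes(slots, sizeof(slots) / sizeof(slots[0]));
}
```

For the seven entries in `toy_demo_flushes()`, per-entry flushing would issue seven flushes while the batched version issues three.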
Isaac J. Manjarres
aa2d4ba15a qcom-io-pgtable-arm: Implement the map_sg() callback
Implement the map_sg() callback for the qcom-io-pgtable-arm
format code. Having the map_sg() callback, instead of the
default implementation which calls iommu_map() per sg entry,
results in fewer indirect calls and thus better performance.

Change-Id: I275fa47b66ddc425c02bb9e93b83b91980485304
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-02-17 21:10:47 -08:00
Isaac J. Manjarres
c38ad44557 iommu/arm-smmu: Defer TLB maintenance until after a buffer is unmapped
TLB maintenance is currently performed while a buffer is being unmapped.
For large buffers, doing so is slower than invalidating the entire TLB
for a particular context after the buffer has been unmapped.

Thus, defer TLB maintenance until a buffer is unmapped, and
iommu_iotlb_sync() is invoked. This shows a significant amount
of improvement in the latency incurred by unmapping a buffer.

Without this patch, we observe the following latencies:

(average over 10 iterations)
    size           iommu_map      iommu_unmap
      4K            1.265 us       651.619 us
     64K            7.666 us       678.968 us
      1M           90.979 us      1152.072 us
      2M          179.885 us      2303.020 us
     12M         1082.140 us      5537.349 us
     24M         2159.463 us      9415.588 us
     32M         2878.609 us     12001.406 us

    size        iommu_map_sg      iommu_unmap
      4K            1.088 us       647.921 us
     64K            7.208 us       680.312 us
      1M          103.505 us      1153.520 us
      2M          200.885 us      2302.593 us
     12M         1159.146 us      5534.989 us
     24M         2300.744 us      9411.614 us
     32M         3057.343 us     12000.468 us

While applying this patch yields the following latencies:

(average over 10 iterations)
    size           iommu_map      iommu_unmap
      4K            1.172 us         5.218 us
     64K            6.229 us         9.338 us
      1M           91.812 us        77.828 us
      2M          179.500 us       154.156 us
     12M         1077.927 us       154.572 us
     24M         2159.630 us       157.453 us
     32M         2883.953 us       157.921 us

    size        iommu_map_sg      iommu_unmap
      4K            1.041 us         5.005 us
     64K            6.781 us         9.364 us
      1M          102.390 us        79.515 us
      2M          200.328 us       152.270 us
     12M         1161.000 us       154.515 us
     24M         2304.369 us       157.822 us
     32M         3059.416 us       160.734 us

Change-Id: I7aecf559746eb65d2543ce9b16ad12492eb70fa1
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-02-17 12:22:20 -08:00
Isaac J. Manjarres
890731edea qcom-io-pgtable-arm: Free empty page tables when memory is unmapped
Page table memory is currently freed when an empty table is encountered
and we are about to replace the table entry with a block mapping.
Removing the table entry implies that TLB maintenance must be performed
to get rid of TLB entries that may refer to the now empty table as part
of the page table walk.

In preparation for deferring TLB maintenance to only when a buffer has
been completely unmapped, add support for freeing empty page tables when
memory is unmapped, so no TLB maintenance is required when mapping memory.

Change-Id: Ic2ffc8ed38d1df2443844fe69a50e2c06484f648
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-02-17 12:22:15 -08:00
Isaac J. Manjarres
f6ebf83337 qcom-io-pgtable-arm: Allow caching buffers in system cache with NWA policy
Add support for mapping memory with the attributes required
for it to be cached in the system cache, with a NWA policy:

MAIR: 0xe4: inner non-cacheable, outer write-back read allocate.

Change-Id: I4baa9bc32e20c2736867bb9871b3fcdedab0bafb
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-14 13:59:46 -08:00
Isaac J. Manjarres
12df5565bc qcom-io-pgtable-arm: Allow caching buffers in the system cache
Add support for mapping memory with the attributes required
for it to be cached in the system cache:

MAIR: 0xf4: inner non-cacheable, outer write-back read/write allocate.

Change-Id: I1fb59d272223cc2a0d34250e7442fafb7190475d
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-14 13:59:46 -08:00
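
The two MAIR values quoted in f6ebf83337 and 12df5565bc above (0xe4 and 0xf4) can be decoded with a short helper. This sketch relies on the ARMv8 MAIR attribute encoding for Normal memory, where the high nibble describes the outer cache and the low nibble the inner cache, and it handles only the two nibble forms these commits use (0b0100 for Non-cacheable and 0b11RW for Write-back non-transient). The `toy_` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_cache_attr {
    bool cacheable;
    bool write_back;
    bool read_alloc;
    bool write_alloc;
};

/* Decode one 4-bit Normal-memory attribute; other encodings are ignored. */
static struct toy_cache_attr toy_decode_nibble(uint8_t nib)
{
    struct toy_cache_attr a = { false, false, false, false };

    if ((nib & 0xc) == 0xc) {             /* 0b11RW: Write-back */
        a.cacheable = true;
        a.write_back = true;
        a.read_alloc = (nib & 0x2) != 0;  /* R bit */
        a.write_alloc = (nib & 0x1) != 0; /* W bit */
    }
    /* 0b0100 (Non-cacheable) leaves every field false. */
    return a;
}

static struct toy_cache_attr toy_outer(uint8_t attr)
{
    return toy_decode_nibble(attr >> 4);
}

static struct toy_cache_attr toy_inner(uint8_t attr)
{
    return toy_decode_nibble(attr & 0xf);
}
```

Decoding 0xe4 gives an outer write-back attribute with read-allocate only, and 0xf4 adds write-allocate; the inner attribute is Non-cacheable in both, as the commit messages state.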
Isaac J. Manjarres
d50643f773 qcom-io-pgtable-arm: Add support for IO_PGTABLE_QUIRK_QCOM_USE_LLC_NWA
The IO_PGTABLE_QUIRK_QCOM_USE_LLC_NWA quirk is used to ensure that
the IOMMU page tables are cached in the system cache with a no write
allocation cache policy. Add support for it by setting up the TCR
with the following memory attributes for the page table walker:

TCR.SH = Outer-shareable
TCR.IRGN = Non-cacheable normal memory
TCR.ORGN = Write-back, no write-allocate cacheable.

Change-Id: Ifa88b673de3b756e5b03bc36e89db84bc013346a
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-12 18:20:32 -08:00
Isaac J. Manjarres
bc74b78bc5 qcom-io-pgtable-arm: Add support for the QCOM_USE_UPSTREAM_HINT quirk
The IO_PGTABLE_QUIRK_QCOM_USE_UPSTREAM_HINT is used to ensure that the
IOMMU page tables are cached in the system cache. Add support for it
by setting up the TCR with the following memory attributes for the
page table walker:

TCR.SH = Outer Shareable
TCR.IRGN = Non-cacheable normal memory
TCR.ORGN = Write-back, write-allocate cacheable.

Change-Id: Iafb16fdee078af746a66821bb50192198beba5bc
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-12 18:20:32 -08:00
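
The walker attributes listed in bc74b78bc5 above, and in d50643f773 for the NWA variant, can be written out as field values. This is a sketch under assumptions: it uses the ARMv8 two-bit encodings for shareability and cacheability and the TCR_EL1 bit positions for the TTBR0 walk (IRGN0 at bit 8, ORGN0 at bit 10, SH0 at bit 12). How the driver stores these fields internally is not shown, and the `TOY_`/`toy_` names are hypothetical.

```c
#include <stdint.h>

/* Shareability encoding. */
enum { TOY_SH_OUTER = 2 };

/* Cacheability encodings for the page table walker. */
enum {
    TOY_RGN_NC     = 0,  /* Non-cacheable normal memory */
    TOY_RGN_WBWA   = 1,  /* Write-back, write-allocate */
    TOY_RGN_WB_NWA = 3   /* Write-back, no write-allocate */
};

/* Pack the walker attributes into their TCR bit positions. */
static uint32_t toy_tcr_walk_attrs(unsigned sh, unsigned irgn, unsigned orgn)
{
    return ((uint32_t)sh << 12) | ((uint32_t)orgn << 10) |
           ((uint32_t)irgn << 8);
}

/* Attributes for the QCOM_USE_UPSTREAM_HINT quirk. */
static uint32_t toy_tcr_upstream_hint(void)
{
    return toy_tcr_walk_attrs(TOY_SH_OUTER, TOY_RGN_NC, TOY_RGN_WBWA);
}

/* Attributes for the QCOM_USE_LLC_NWA quirk. */
static uint32_t toy_tcr_llc_nwa(void)
{
    return toy_tcr_walk_attrs(TOY_SH_OUTER, TOY_RGN_NC, TOY_RGN_WB_NWA);
}
```

The two quirks differ only in the ORGN field, which selects whether the outer (system) cache allocates on writes made by the page table walker.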
Isaac J. Manjarres
a730210d5d qcom-io-pgtable-arm: Defer page table allocations to the IOMMU driver
Add the initial implementation for an IOMMU driver to allocate
and free page table memory. The IOMMU driver implementation of these
hooks will take care of preparing the page tables prior to use.

Change-Id: I1c3ef02fc9464a31d0e0ce65627692d53aa0f976
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-12 18:20:32 -08:00
Patrick Daly
767329165c qcom_iommu_util: Add support for qcom-io-pgtable-arm
Add support for using this submodule.

Change-Id: I3658589b1d38ddcdf8bc9d2c01f6042cfeb964d5
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-12 18:20:32 -08:00
Patrick Daly
d80793c7cc iommu: Duplicate io-pgtable-arm functionality
Create a fork of io-pgtable-arm as of android12-5.10
commit 19057a6a6b ("Merge 5.10.4 into android12-5.10").

This is done in order to support qcom value added features such as:
qcom secure memory model.
refcounting & freeing page table memory.
map_sg operation
intelligent tlb invalidate operations.

Some of the above features may be outdated or no longer required
with new hardware, or may have upstream alternatives. However,
proving that these features are unnecessary may require extensive
testing. Therefore, port them to the GKI model so that this
testing may take place.

Change-Id: I95112cc260d3d254e7703513818b21e066d69978
Signed-off-by: Patrick Daly <pdaly@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
2021-01-12 18:20:32 -08:00