Add the IOCTL interface so that applications can allocate per VM BOs.
Still WIP since not all corner cases are tested yet, but this reduces average
CS overhead for 10K BOs from 21ms down to 48us.
v2: add some extra checks, remove the WIP tag
v3: rename new flag to AMDGPU_GEM_CREATE_VM_ALWAYS_VALID
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Per VM BOs are handled like VM PDs and PTs. They are always valid and don't
need to be specified in the BO lists.
v2: validate PDs/PTs first
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to refer to the parent instead of the root BO for multi
level page tables on Vega10. Also don't set the PDE_PTE bit.
v2: Don't set the PDE_PTE bit either.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We changed this to use an extra list a while back, but for the next
series I need a separate flag again.
v2: reorder to avoid unlocked list access
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of using the vm_state use a separate flag to note
that the BO was moved.
v2: reorder patches to avoid temporary lockless access
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As there is no PPTable in RV, it is difficult to
cleanly decouple PPTABLE functionality in existing
codes.
v2: agd: squash in clean build fix
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To allocate additional space for the dynamic cu masks.
Confirmed with the hw team that we only need 1 dword
for the mask. The mask is the same for each SE so
you only need 1 dword.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It was unused and according to hw team, it's the same for
all asics in a gfx family so remove it.
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stop checking the mapped BO itself, cause that one is
certainly not a page table.
Additional to that move the code into amdgpu_vm.c
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add 2 debugfs files, one that contains the VBIOS version, and one that
contains the VBIOS itself. These won't change after initialization,
so we can add the VBIOS version when we parse the atombios information.
This ensures that we can find out the VBIOS version, even when the dmesg
buffer fills up, and makes it easier to associate which VBIOS version is
for which GPU on mGPU configurations. Set the size to 20 characters in
case of some weird VBIOS version that exceeds the expected 17 character
format (3-8-3\0). The VBIOS dump also allows for easy debugging
v2: Move to debugfs, clarify commit message, add VBIOS dump file
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Switches the AMDGPU driver over to the TTM tracepoint and removes
our old one. Now you can enable traces before loading the module
and trace all mappings.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(v2): Use struct device instead of pci in trace.
Newer versions of the CP firmware require changes in how the driver
initializes the hw block.
Change the firmware name for new firmware to maintain compatibility with
older kernels.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove a redundant identical return statement, it has no use.
Detected by CoverityScan, CID#1454586 ("Structurally dead code")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check memory allocation failure and return -ENOMEM in such a case.
'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.
The calling graph would be, in such a case!
failure in amdgpu_cs_process_syncobj_out_dep()
---> error code returned by amdgpu_cs_dependencies()
--> amdgpu_cs_parser_fini() is called
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.
v2: agd: fix warning
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use ttm_bo_mem_space instead of manually allocating GART space.
This allows us to evict BOs when there isn't enought GART space any more.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>