Huang Rui
1f43631be5
drm/amdgpu: add gfx v10 clear state header v2
...
Clear state for gfx pipe.
v2: squash in updates
Signed-off-by: Huang Rui <ray.huang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:16:37 -05:00
Huang Rui
a9833d02b5
drm/amdgpu: add v10 structs header (v2)
...
Header for CP structures (MQD, etc.)
V2: squash in updates
Signed-off-by: Huang Rui <ray.huang@amd.com >
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:16:37 -05:00
Hawking Zhang
35c2e91059
drm/amdgpu: parse the new members added by gpu_info ucode v1_1
...
Parse the new parameters for gfx10.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:16:16 -05:00
Hawking Zhang
109c80ddb4
drm/amdgpu: add gpu_info_firmware v1_1 structure for navi10
...
two new members that specific for navi10 are included in v2_0:
num_sc_per_sh and num_packer_per_sc
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:16:11 -05:00
Huang Rui
23c6268eb1
drm/amdgpu: add navi10 gpu info firmware
...
gpu info firmware stores configuration data for various
IP blocks.
Signed-off-by: Huang Rui <ray.huang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:15:29 -05:00
Hawking Zhang
3e514732c0
drm/amdgpu: add gfx10 specific new member pa_sc_tile_steering_override
...
New gfx config parameter.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:15:01 -05:00
Hawking Zhang
02a9e40a83
drm/amdgpu: add gfx10 specific config in amdgpu_gfx_config
...
The two members are used to cache the values from gpu_info fw accordingly
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:14:57 -05:00
Hawking Zhang
5228fe3010
drm/amdgpu: Add GDDR6 in vram_name arrary
...
For printing vram type.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Tao Zhou <Tao.Zhou1@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 21:14:54 -05:00
Huang Rui
852a6626d5
drm/amdgpu: add navi10 asic type
...
Signed-off-by: Huang Rui <ray.huang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:56 -05:00
Hawking Zhang
33934b3576
drm/amdgpu: add navi10 ip offset header
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:53 -05:00
Hawking Zhang
76a2d0b0a1
drm/amdgpu: add doorbell assignement for navi10
...
Update mappings for Navi10.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:51 -05:00
Hawking Zhang
10e4b22735
drm/amdgpu: atomfirmware.h updates for navi10
...
Updated tables for Navi10.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:48 -05:00
Hawking Zhang
efd8725f03
drm/amdgpu: add navi10 enums header
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:46 -05:00
Hawking Zhang
d2996831b2
drm/amdgpu: add SMUIO 11.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:44 -05:00
Hawking Zhang
3d220cc3bd
drm/amdgpu: add OSS 5.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:42 -05:00
Hawking Zhang
f519f0be45
drm/amdgpu: add MMHUB 2.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:39 -05:00
Hawking Zhang
be4008b8c5
drm/amdgpu: add GC 10.1 register headers (v4)
...
v2: Update regs (Alex)
v3: More updates (Alex)
v4: more updates (Alex)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:35 -05:00
Hawking Zhang
326354fa97
drm/amdgpu: add VCN 2.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:33 -05:00
Hawking Zhang
9edefe7bac
drm/amdgpu: add NBIO 2.3 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:31 -05:00
Hawking Zhang
d33ad04027
drm/amdgpu: add MP 11.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:28 -05:00
Hawking Zhang
2a3196f1f0
drm/amdgpu: add HDP 5.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:26 -05:00
Hawking Zhang
d6ad5023e8
drm/amdgpu: add DCN 2.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:23 -05:00
Hawking Zhang
ae213c4450
drm/amdgpu: add CLK 11.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:21 -05:00
Hawking Zhang
db3239f535
drm/amdgpu: add ATHUB 2.0 register headers
...
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 15:54:18 -05:00
Alex Deucher
4f0793989f
Revert "drm/amd/display: Copy stream updates onto streams"
...
This reverts commit 6e5155ae6b
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:30:01 -05:00
Alex Deucher
1a1da391c9
Revert "drm/amd/display: Use macro for invalid OPP ID"
...
This reverts commit 1760bd06c8
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:29:08 -05:00
Alex Deucher
ecbc382c9f
Revert "drm/amd/display: Rework CRTC color management"
...
This reverts commit 7cd4b70091
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:27:54 -05:00
Alex Deucher
f94ec6f8b8
Revert "drm/amd/display: move vmid determination logic out of dc"
...
This reverts commit 11cd74cdb9
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:26:57 -05:00
Alex Deucher
0198b6e5be
Revert "drm/amd/display: Add Underflow Asserts to dc"
...
This reverts commit 9ed43ef84d
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:25:54 -05:00
Alex Deucher
76d981a9fe
Revert "drm/amd/display: make clk_mgr call enable_pme_wa"
...
This reverts commit a1651530a3
.
Revert this to apply the version that includes DCN2 support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 12:24:18 -05:00
Nicholas Kazlauskas
70a1efac71
Revert "drm/amd/display: Enable fast plane updates when state->allow_modeset = true"
...
This reverts commit ebc8c6f18322ad54275997a888ca1731d74b711f.
There are still missing corner cases with cursor interaction and these
fast plane updates on Picasso and Raven2 leading to endless PSTATE
warnings for typical desktop usage depending on the userspace.
This change should be reverted until these issues have been resolved.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110949
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com >
Reviewed-by: David Francis <david.francis@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:38:53 -05:00
Jack Zhang
a95ecb653a
drm/amdgpu/sriov: fix Tonga load driver failed
...
Tonga sriov need to use smu to load firmware.
Remove sriov flag because the default return value is zero.
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com >
Reviewed-by: Trigger Huang <Trigger.Huang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:36:45 -05:00
Jonathan Kim
9c7c85f7ea
drm/amdgpu: add pmu counters
...
adding perf event counters
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com >
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:36:22 -05:00
Jonathan Kim
e4cf4bf5b8
drm/amdgpu: update df_v3_6 for xgmi perfmons (v2)
...
add pmu attribute groups and structures for perf events.
add sysfs to track available df perfmon counters
fix overflow handling in perfmon counter reads.
v2: squash in fix (Alex)
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com >
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:35:45 -05:00
Roman Li
496091fa04
drm/amd/display: Fix null-deref on vega20 with xgmi
...
[Why]
After clkmgr rework it gets initialized after resource pool.
The clkmgr is used in resource pool init for xgmi path.
That causes driver crash on Vega20 with xgmi due to NULL deref.
[How]
Move xgmi compensation code to dce121_clk_mgr_construct()
That also allows to make dce121_clock_patch_xgmi_ss_info()
internal static function.
Signed-off-by: Roman Li <Roman.Li@amd.com >
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:34:32 -05:00
Kent Russell
de9f26bbd3
drm/amdkfd: Add procfs-style information for KFD processes
...
Add a folder structure to /sys/class/kfd/kfd/ called proc which contains
subfolders, each representing an active KFD process' PID, containing 1
file: pasid.
Signed-off-by: Kent Russell <kent.russell@amd.com >
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com >
Acked-by: Christian König <christian.koenig@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:34:00 -05:00
Philip Yang
e82fdb16a0
drm/amdgpu: improve HMM error -ENOMEM and -EBUSY handling
...
Under memory pressure, hmm_range_fault may return error code -ENOMEM
or -EBUSY, change pr_info to pr_debug to remove unnecessary kernel log
message because we will retry restore again.
Call get_user_pages_done if TTM get user pages failed will have
WARN_ONCE kernel calling stack dump log.
Signed-off-by: Philip Yang <Philip.Yang@amd.com >
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:33:41 -05:00
Tom St Denis
c1d827d62f
drm/amd/amdgpu: cast mem->num_pages to 64-bits when shifting (v2)
...
On 32-bit hosts mem->num_pages is 32-bits and can overflow
when shifted. Add a cast to avoid this.
(v2): Style fix.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:32:24 -05:00
xinhui pan
acb05f0a3f
drm/amdgpu: Do error injection even vram reserve fails
...
As long as the address is mapped with vram, we can do an error
injection.
Signed-off-by: xinhui pan <xinhui.pan@amd.com >
Acked-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-20 11:32:10 -05:00
Daniel Vetter
52d2d44eee
Merge v5.2-rc5 into drm-next
...
Maarten needs -rc4 backmerged so he can pull in the fbcon notifier
removal topic branch into drm-misc-next.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch >
2019-06-19 12:07:29 +02:00
Alex Deucher
21a249ca02
drm/amdgpu: wait to fetch the vbios until after common init
...
We need the asic_funcs set for the get rom callbacks in some
cases.
Tested-by: Kent Russell <kent.russell@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Markus Elfring
b934152170
drm/amd/powerplay: Delete a redundant memory setting in vega20_set_default_od8_setttings()
...
The memory was set to zero already by a call of the function “kzalloc”.
Thus remove an extra call of the function “memset” for this purpose.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Markus Elfring
4fe7d1a8a4
drm/amd/display: Delete a redundant memory setting in amdgpu_dm_irq_register_interrupt()
...
The memory was set to zero already by a call of the function “kzalloc”.
Thus remove an extra call of the function “memset” for this purpose.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Arnd Bergmann
e1a2f2d23a
drm/amdgpu: fix error handling in df_v3_6_pmc_start
...
When df_v3_6_pmc_get_ctrl_settings() fails for some reason, we
store uninitialized data in a register, as gcc points out:
drivers/gpu/drm/amd/amdgpu/df_v3_6.c: In function 'df_v3_6_pmc_start':
drivers/gpu/drm/amd/amdgpu/amdgpu.h:1012:29: error: 'lo_val' may be used uninitialized in this function [-Werror=maybe-uninitialized]
#define WREG32_PCIE(reg, v) adev->pcie_wreg(adev, (reg), (v))
^~~~
drivers/gpu/drm/amd/amdgpu/df_v3_6.c:334:39: note: 'lo_val' was declared here
uint32_t lo_base_addr, hi_base_addr, lo_val, hi_val;
^~~~~~
Make it return a proper error code that we can catch in the caller.
Fixes: 992af942a6
("drm/amdgpu: add df perfmon regs and funcs for xgmi")
Signed-off-by: Arnd Bergmann <arnd@arndb.de >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Geert Uytterhoeven
b6bb56ac7d
drm/amd/display: Add missing newline at end of file
...
"git diff" says:
\ No newline at end of file
after modifying the file.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Prike Liang
82973e078b
drm/amd/powerplay: detect version of smu backend (v2)
...
Print the backend type.
v2: whitespace fixes (Alex)
Signed-off-by: Prike Liang <Prike.Liang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:02:03 -05:00
Oak Zeng
38bb4226ff
drm/amdkfd: Fix sdma queue allocate race condition
...
SDMA queue allocation requires the dqm lock as it modify
the global dqm members. Enclose it in the dqm_lock.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com >
Reviewed-by: Philip Yang <philip.yang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:01:41 -05:00
Oak Zeng
6a6ef5ee25
drm/amdkfd: Fix a circular lock dependency
...
The idea to break the circular lock dependency is to temporarily drop
dqm lock before calling allocate_mqd. See callstack #1 below.
[ 59.510149] [drm] Initialized amdgpu 3.30.0 20150101 for 0000:04:00.0 on minor 0
[ 513.604034] ======================================================
[ 513.604205] WARNING: possible circular locking dependency detected
[ 513.604375] 4.18.0-kfd-root #2 Tainted: G W
[ 513.604530] ------------------------------------------------------
[ 513.604699] kswapd0/611 is trying to acquire lock:
[ 513.604840] 00000000d254022e (&dqm->lock_hidden){+.+.}, at: evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.605150]
but task is already holding lock:
[ 513.605307] 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[ 513.605540]
which lock already depends on the new lock.
[ 513.605747]
the existing dependency chain (in reverse order) is:
[ 513.605944]
-> #4 (&anon_vma->rwsem){++++}:
[ 513.606106] __vma_adjust+0x147/0x7f0
[ 513.606231] __split_vma+0x179/0x190
[ 513.606353] mprotect_fixup+0x217/0x260
[ 513.606553] do_mprotect_pkey+0x211/0x380
[ 513.606752] __x64_sys_mprotect+0x1b/0x20
[ 513.606954] do_syscall_64+0x50/0x1a0
[ 513.607149] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 513.607380]
-> #3 (&mapping->i_mmap_rwsem){++++}:
[ 513.607678] rmap_walk_file+0x1f0/0x280
[ 513.607887] page_referenced+0xdd/0x180
[ 513.608081] shrink_page_list+0x853/0xcb0
[ 513.608279] shrink_inactive_list+0x33b/0x700
[ 513.608483] shrink_node_memcg+0x37a/0x7f0
[ 513.608682] shrink_node+0xd8/0x490
[ 513.608869] balance_pgdat+0x18b/0x3b0
[ 513.609062] kswapd+0x203/0x5c0
[ 513.609241] kthread+0x100/0x140
[ 513.609420] ret_from_fork+0x24/0x30
[ 513.609607]
-> #2 (fs_reclaim){+.+.}:
[ 513.609883] kmem_cache_alloc_trace+0x34/0x2e0
[ 513.610093] reservation_object_reserve_shared+0x139/0x300
[ 513.610326] ttm_bo_init_reserved+0x291/0x480 [ttm]
[ 513.610567] amdgpu_bo_do_create+0x1d2/0x650 [amdgpu]
[ 513.610811] amdgpu_bo_create+0x40/0x1f0 [amdgpu]
[ 513.611041] amdgpu_bo_create_reserved+0x249/0x2d0 [amdgpu]
[ 513.611290] amdgpu_bo_create_kernel+0x12/0x70 [amdgpu]
[ 513.611584] amdgpu_ttm_init+0x2cb/0x560 [amdgpu]
[ 513.611823] gmc_v9_0_sw_init+0x400/0x750 [amdgpu]
[ 513.612491] amdgpu_device_init+0x14eb/0x1990 [amdgpu]
[ 513.612730] amdgpu_driver_load_kms+0x78/0x290 [amdgpu]
[ 513.612958] drm_dev_register+0x111/0x1a0
[ 513.613171] amdgpu_pci_probe+0x11c/0x1e0 [amdgpu]
[ 513.613389] local_pci_probe+0x3f/0x90
[ 513.613581] pci_device_probe+0x102/0x1c0
[ 513.613779] driver_probe_device+0x2a7/0x480
[ 513.613984] __driver_attach+0x10a/0x110
[ 513.614179] bus_for_each_dev+0x67/0xc0
[ 513.614372] bus_add_driver+0x1eb/0x260
[ 513.614565] driver_register+0x5b/0xe0
[ 513.614756] do_one_initcall+0xac/0x357
[ 513.614952] do_init_module+0x5b/0x213
[ 513.615145] load_module+0x2542/0x2d30
[ 513.615337] __do_sys_finit_module+0xd2/0x100
[ 513.615541] do_syscall_64+0x50/0x1a0
[ 513.615731] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 513.615963]
-> #1 (reservation_ww_class_mutex){+.+.}:
[ 513.616293] amdgpu_amdkfd_alloc_gtt_mem+0xcf/0x2c0 [amdgpu]
[ 513.616554] init_mqd+0x223/0x260 [amdgpu]
[ 513.616779] create_queue_nocpsch+0x4d9/0x600 [amdgpu]
[ 513.617031] pqm_create_queue+0x37c/0x520 [amdgpu]
[ 513.617270] kfd_ioctl_create_queue+0x2f9/0x650 [amdgpu]
[ 513.617522] kfd_ioctl+0x202/0x350 [amdgpu]
[ 513.617724] do_vfs_ioctl+0x9f/0x6c0
[ 513.617914] ksys_ioctl+0x66/0x70
[ 513.618095] __x64_sys_ioctl+0x16/0x20
[ 513.618286] do_syscall_64+0x50/0x1a0
[ 513.618476] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 513.618695]
-> #0 (&dqm->lock_hidden){+.+.}:
[ 513.618984] __mutex_lock+0x98/0x970
[ 513.619197] evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.619459] kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[ 513.619710] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[ 513.620103] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[ 513.620363] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[ 513.620614] __mmu_notifier_invalidate_range_start+0x70/0xb0
[ 513.620851] try_to_unmap_one+0x7fc/0x8f0
[ 513.621049] rmap_walk_anon+0x121/0x290
[ 513.621242] try_to_unmap+0x93/0xf0
[ 513.621428] shrink_page_list+0x606/0xcb0
[ 513.621625] shrink_inactive_list+0x33b/0x700
[ 513.621835] shrink_node_memcg+0x37a/0x7f0
[ 513.622034] shrink_node+0xd8/0x490
[ 513.622219] balance_pgdat+0x18b/0x3b0
[ 513.622410] kswapd+0x203/0x5c0
[ 513.622589] kthread+0x100/0x140
[ 513.622769] ret_from_fork+0x24/0x30
[ 513.622957]
other info that might help us debug this:
[ 513.623354] Chain exists of:
&dqm->lock_hidden --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem
[ 513.623900] Possible unsafe locking scenario:
[ 513.624189] CPU0 CPU1
[ 513.624397] ---- ----
[ 513.624594] lock(&anon_vma->rwsem);
[ 513.624771] lock(&mapping->i_mmap_rwsem);
[ 513.625020] lock(&anon_vma->rwsem);
[ 513.625253] lock(&dqm->lock_hidden);
[ 513.625433]
*** DEADLOCK ***
[ 513.625783] 3 locks held by kswapd0/611:
[ 513.625967] #0 : 00000000f14edf84 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
[ 513.626309] #1 : 00000000961547fc (&anon_vma->rwsem){++++}, at: page_lock_anon_vma_read+0xe4/0x250
[ 513.626671] #2 : 0000000067b5cd12 (srcu){....}, at: __mmu_notifier_invalidate_range_start+0x5/0xb0
[ 513.627037]
stack backtrace:
[ 513.627292] CPU: 0 PID: 611 Comm: kswapd0 Tainted: G W 4.18.0-kfd-root #2
[ 513.627632] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 513.627990] Call Trace:
[ 513.628143] dump_stack+0x7c/0xbb
[ 513.628315] print_circular_bug.isra.37+0x21b/0x228
[ 513.628581] __lock_acquire+0xf7d/0x1470
[ 513.628782] ? unwind_next_frame+0x6c/0x4f0
[ 513.628974] ? lock_acquire+0xec/0x1e0
[ 513.629154] lock_acquire+0xec/0x1e0
[ 513.629357] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.629587] __mutex_lock+0x98/0x970
[ 513.629790] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.630047] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.630309] ? evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.630562] evict_process_queues_nocpsch+0x26/0x140 [amdgpu]
[ 513.630816] kfd_process_evict_queues+0x3b/0xb0 [amdgpu]
[ 513.631057] kgd2kfd_quiesce_mm+0x1c/0x40 [amdgpu]
[ 513.631288] amdgpu_amdkfd_evict_userptr+0x38/0x70 [amdgpu]
[ 513.631536] amdgpu_mn_invalidate_range_start_hsa+0xa6/0xc0 [amdgpu]
[ 513.632076] __mmu_notifier_invalidate_range_start+0x70/0xb0
[ 513.632299] try_to_unmap_one+0x7fc/0x8f0
[ 513.632487] ? page_lock_anon_vma_read+0x68/0x250
[ 513.632690] rmap_walk_anon+0x121/0x290
[ 513.632875] try_to_unmap+0x93/0xf0
[ 513.633050] ? page_remove_rmap+0x330/0x330
[ 513.633239] ? rcu_read_unlock+0x60/0x60
[ 513.633422] ? page_get_anon_vma+0x160/0x160
[ 513.633613] shrink_page_list+0x606/0xcb0
[ 513.633800] shrink_inactive_list+0x33b/0x700
[ 513.633997] shrink_node_memcg+0x37a/0x7f0
[ 513.634186] ? shrink_node+0xd8/0x490
[ 513.634363] shrink_node+0xd8/0x490
[ 513.634537] balance_pgdat+0x18b/0x3b0
[ 513.634718] kswapd+0x203/0x5c0
[ 513.634887] ? wait_woken+0xb0/0xb0
[ 513.635062] kthread+0x100/0x140
[ 513.635231] ? balance_pgdat+0x3b0/0x3b0
[ 513.635414] ? kthread_delayed_work_timer_fn+0x80/0x80
[ 513.635626] ret_from_fork+0x24/0x30
[ 513.636042] Evicting PASID 32768 queues
[ 513.936236] Restoring PASID 32768 queues
[ 524.708912] Evicting PASID 32768 queues
[ 524.999875] Restoring PASID 32768 queues
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com >
Reviewed-by: Philip Yang <philip.yang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:01:41 -05:00
Oak Zeng
d091bc0a70
Revert "drm/amdkfd: Fix a circular lock dependency"
...
This reverts commit 06b89b38f3
.
This fix is not proper. allocate_mqd can't be moved before
allocate_sdma_queue as it depends on q->properties->sdma_id
set in later.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com >
Reviewed-by: Philip Yang <philip.yang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:01:41 -05:00
Oak Zeng
70d488fb3f
Revert "drm/amdkfd: Fix sdma queue allocate race condition"
...
This reverts commit f77dac6cd6
.
This fix is not proper. allocate_mqd can't be moved before
allocate_sdma_queue as it depends on q->properties->sdma_id
set in later.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com >
Reviewed-by: Philip Yang <philip.yang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-06-17 11:01:41 -05:00