Decauple sched threads stop and start and ring mirror
list handling from the policy of what to do about the
guilty jobs.
When stoppping the sched thread and detaching sched fences
from non signaled HW fenes wait for all signaled HW fences
to complete before rerunning the jobs.
v2: Fix resubmission of guilty job into HW after refactoring.
v4:
Full restart for all the jobs, not only from guilty ring.
Extract karma increase into standalone function.
v5:
Rework waiting for signaled jobs without relying on the job
struct itself as those might already be freed for non 'guilty'
job's schedulers.
Expose karma increase to drivers.
v6:
Use list_for_each_entry_safe_continue and drm_sched_process_job
in case fence already signaled.
Call drm_sched_increase_karma only once for amdgpu and add documentation.
v7:
Wait only for the latest job's fence.
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The power smu7 powerplay code is much more robust and has
been the default for a while now. Remove the old code.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing power_average to visible check for power
attributes for APUs. Was missed before.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace the last bool type parameter with a general flags parameter,
to make the last parameter be able to contain more information.
v2: drop setting need_ctx_switch = false
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sriov would meet guest driver load failure,
if calling amdgpu_asic_reset in amdgpu_device_init.
sriov should skip asic_reset in device_init.
Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The entries are ignored for now, but it at least stops crashing the
hardware when somebody tries to push something to the other IH rings.
v2: limit ring size, add TODO comment
v3: only program rings if they are actually allocated
v4: limit the ring init to Vega10
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Otherwise we run into a non-retry fault on access.
It seems to be a hardware bug that the executable bit has
higher priority than the valid bit.
v2: handle clears as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
User can use "pp_dpm_dcefclk" to retrieve and adjust dcefclock power
levels.
V2: expose this interface for Vega10 and later ASICs only
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
User can use "pp_dpm_fclk" to retrieve and adjust fclock power
levels.
V2: expose this interface for Vega20 and later ASICs only
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
User can use "pp_dpm_socclk" to retrieve and adjust SOC clock power
levels.
V2: expose this interface for Vega10 and later ASICs only
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
User can use "ppfeatures" sysfs interface to retrieve and set enabled
powerplay features.
V2: expose this feature for Vega10 and later dGPUs
V3: squash in removal of unused variable (Alex)
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In some cases, psp response status is not 0 even there is no
problem while the command is submitted. Some version of PSP FW
doesn't write 0 to that field.
So here we would like to only print a warning instead of an error
during psp initialization to avoid breaking hw_init and it doesn't
return -EINVAL.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Xiangliang Yu<Xiangliang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Paul Menzel <pmenzel+amd-gfx@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
HW doorbell writing routing policy: writing to doorbell
not in SDMA/IH/MM/ACV doorbell range will be routed to CP.
So CP doorbell routing depends on doorbell range setting
of above blocks. Setting doorbell range of above blocks
earlier (soc15_common_hw_init) to make sure CP doorbell
writing be routed to CP block.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Different ASIC has different SDMA queue number so
different SDMA doorbell range. Introduce an extra
parameter to sdma_doorbell_range function and set
sdma doorbell range correctly.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Different ASIC has different sdma doorbell range. Add
a per device sdma_doorbell_range field and initialize
it.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SOC15 chips require a reset if the driver was previously loaded
because the PSP can only be loaded once between each reset.
v2: rebase, handle multiple asic funcs
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
VI chips require a reset if the driver was previously loaded
because the SMU can only be loaded once between each reset.
v2: rebase
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CIK chips require a reset if the driver was previously loaded
because the SMU can only be loaded once between each reset.
v2: rebase
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Used to determine if we need to reset the asic on init due
to the driver having been previously loaded or not shutdown
cleanly. E.g., kexec or VM passthrough.
v2: rebase
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Expose sclk (gfx clock) and mclk (memory clock) via
hwmon compatible interface. hwmon does not actually
formally specify a frequency type attribute, but these
are compatible with the format of the other attributes
exposed via hwmon. Units are hertz.
freq1_input - GPU gfx/compute clock in hertz
freq2_input - GPU memory clock in hertz (dGPU only)
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a sysfs file that reports the number of bytes transmitted and
received in the last second. This can be used to approximate the PCIe
bandwidth usage over the last second.
v2: Clarify use of mps as estimation of bandwidth
v3: Don't make the file on APUs
v4: Early exit for APUs in the read function, change output to
display "packets-received packets-sent mps"
v5: fix missing header for si (Alex)
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need these offsets for PCIE perf counters, so include them as well as
the the previously-used defines from the nbio_*.c files
v2: Return NBIF definitions back to previous files
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: Move locks around in other functions so that this
function can stand on its own. Also only hold the hive
specific lock for add/remove device instead of the driver
global lock so you can't add/remove devices in parallel from
one hive.
v3: add reset_lock
Acked-by: Shaoyun.liu < Shaoyun.liu@amd.com>
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix boolean expressions by using logical AND operator '&&'
instead of bitwise operator '&'.
This issue was detected with the help of Coccinelle.
Fixes: 9a4b7d4c76 ("drm/amdgpu: Add vm context module param")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When an ring buffer overflow happens the appropriate bit is set in the WPTR
register which is also written back to memory. But clearing the bit in the
WPTR doesn't trigger another memory writeback.
So what can happen is that we end up processing the buffer overflow over and
over again because the bit is never cleared. Resulting in a random system
lockup because of an infinite loop in an interrupt handler.
This is 100% reproducible on Vega10, but it's most likely an issue we have
in the driver over all generations all the way back to radeon.
v2: rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Initializing structures with { } is known to be problematic since
it doesn't necessararily initialize all bytes, in case of padding,
causing random failures when structures are memcmp().
This patch fixes the structure initialisation related compiler
error by memset().
V2: rectified missing piece in coding
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There isn't ucode when executing INVOKE command, so current code can't
check the failure of INVOKE command.
Remove the ucode check.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver get session id after loading TA FW and the session id is used
by driver instances to communicate with TA. PF and VF have different
session id.
xGMI session id should get from response buffer, correct it.
Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>