front end is initialized during init_hw, but not
power gated. There are some left over valuse and will
cause some diags test failed. Power gated all front
end pipes will make sure every test has same starting
point.
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The GTT manager handles the GART address space anyway, so it is
completely pointless to keep the same information around twice.
v2: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename amdgpu_gtt_mgr_is_allocated() to amdgpu_gtt_mgr_has_gart_addr() and use
that instead.
v2: rename the function as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1,no sriov check since gpu recover is unified
2,need CPU_ACCESS_REQUIRED flag for VRAM if SRIOV
because otherwise after following PIN the first allocated
VRAM bo is wasted due to some TTM mgr reason.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
since now gpu reset is unified with gpu_recover
for both bare-metal and SR-IOV:
1)rename in_sriov_reset to in_gpu_reset
2)move lock_reset from adev->virt to adev
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1,new imple names amdgpu_gpu_recover which gives more hint
on what it does compared with gpu_reset
2,gpu_recover unify bare-metal and SR-IOV, only the asic reset
part is implemented differently
3,gpu_recover will increase hang job karma and mark its entity/context
as guilty if exceeds limit
V2:
4,in scheduler main routine the job from guilty context will be immedialy
fake signaled after it poped from queue and its fence be set with
"-ECANCELED" error
5,in scheduler recovery routine all jobs from the guilty entity would be
dropped
6,in run_job() routine the real IB submission would be skipped if @skip parameter
equales true or there was VRAM lost occured.
V3:
7,replace deprecated gpu reset, use new gpu recover
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
jobs are skipped under two cases
1)when the entity behind this job marked guilty, the job
poped from this entity's queue will be dropped in sched_main loop.
2)in job_recovery(), skip the scheduling job if its karma detected
above limit, and also skipped as well for other jobs sharing the
same fence context. this approach is becuase job_recovery() cannot
access job->entity due to entity may already dead.
v2:
some logic fix
v3:
when entity detected guilty, don't drop the job in the poping
stage, instead set its fence error as -ECANCELED
in run_job(), skip the scheduling either:1) fence->error < 0
or 2) there was a VRAM LOST occurred on this job.
this way we can unify the job skipping logic.
with this feature we can introduce new gpu recover feature.
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is a follow-up patch to:
Leo (Sunpeng) Li Cleanup code that enables freesync
We should only add a stream to freesync if a modeset was requested, so
we don't fill the core freesync map with pointless streams.
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Changes to atomic set property for freesync.
Now In set property, just set the freesync variables and return 0.
Based on the variables call mod_freesync_set_user_enable() inside
commit_tail
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't currently expose variable refresh rate, but add the state to the
connector state to make it easier to maintain the support for it from a
hw support perspective while we figure out the uapi for drm.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the programming sequence to allow DMCU firmware to be loaded by
PSP. This code detects whether the firmware is loaded and does a check
to verify the expected interface version and checks for correct response
from micro controller.
Added registry key method to allow force loading of firmware from kernel
mode driver for test purposes. This is old method of firmware loading
without PSP.
Moved some init sequences into dc/dmcu.
Changed loading sequence to initialize IRAM after firmware completely
loaded. Firmware will now disable features that use IRAM
until initialized.
Signed-off-by: Anthony Koo <anthony.koo@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Program cursor attributes during set mode is only needed
in case of pipe slipt, bottom pipe should be programmed
same as top pipe. Need to program if address is 0.
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
move powerplay and amdgpu shared structures
and definitions to kgd_pp_interface.h. This
is the interface between the base driver
and powerplay.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The exclusive mode has real-time limitation in reality, such like being
done in 300ms. It's easy observed if running many VF/VMs in single host
with heavy CPU workload.
If we find the init fails due to exclusive mode timeout, try it again.
v2:
- rewrite the condition for readable value.
v3:
- fix typo, add comments for sleep
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver can use this interface to check if there's a function level
reset done in hypervisor. It's helpful when IRQ handler for reset
is not ready, or special handling is required.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMIO space can be blocked on virtualised device. Add this
function to check if MMIO is blocked or not.
Todo: need a reliable method such like communation
with hypervisor.
v2:
- add comments inline
Signed-off-by: pding <Pixel.Ding@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Normally all waiting get timeout if there's one.
Release the lock and return immediately when timeout happens.
v2:
- set the se_sh to broadcase before return
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When this VF stays in exclusive mode for long, other VFs will be
impacted.
The redundant messages causes exclusive mode timeout when they're
redirected. That is a normal use case for cloud service to redirect
guest log to virtual serial port.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: pding <Pixel.Ding@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>