drm/amdgpu: update athub interrupt harvesting handle

GCEA/MMHUB EA error should not result to DF freeze, this is
fixed in next generation, but for some reasons the GCEA/MMHUB
EA error will result to DF freeze in previous generation,
diver should avoid to indicate GCEA/MMHUB EA error as hw fatal
error in kernel message by read GCEA/MMHUB err status registers.

Changed from V1:
    make query_ras_error_status function more general
    make read mmhub er status register more friendly

Changed from V2:
    move ras error status query function into do_recovery workqueue

Changed from V3:
    remove useless code from V2, print GCEA error status
    instance number

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is contained in:
Stanley.Yang
2020-09-15 16:15:05 +08:00
committed by Alex Deucher
parent d117413f5e
commit 3f975d0f71
8 changed files with 108 additions and 2 deletions

View File

@@ -205,6 +205,8 @@
#define mmGCEA_EDC_CNT2_BASE_IDX 0
#define mmGCEA_EDC_CNT3 0x071b
#define mmGCEA_EDC_CNT3_BASE_IDX 0
#define mmGCEA_ERR_STATUS 0x0712
#define mmGCEA_ERR_STATUS_BASE_IDX 0
// addressBlock: gc_gfxudec
// base address: 0x30000
@@ -261,4 +263,4 @@
#define mmRLC_EDC_CNT2 0x4d41
#define mmRLC_EDC_CNT2_BASE_IDX 1
#endif
#endif