Even though this is required for degamma since DCE HW only supports a
couple predefined LUTs we can just program the LUT directly for regamma.
This fixes dark screens which occurs when we program regamma to bypass
while degamma is using srgb LUT.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Only accept interrupts from KFD VMIDs. Just checking for a PASID may
not be enough because amdgpu started using PASIDs to map VM faults
to processes.
Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug).
Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Suggested-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
HWS may hang in the middle of destroy queue, remove the queue from the
process queue list so it won't be freed again in the future
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The initialization is not necessary. amd-kfd-staging and ROCm
releases have worked without it for two years.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
It turns out that idr_for_each_entry is really slow compared to just
iterating over the slots. Based on measurements the difference is
estimated to be about a factor 64. That means using idr_for_each_entry
is only worth it with very few allocated events.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
The problem happens on Raven and Carrizo. The context save handler
should not clear the high bits of PC_HI before extracting the bits
of IB_STS.
The bug is not relevant to VEGA10 until we enable demand paging.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Since the assembly code is inside "#if 0", it is ineffective. Despite that,
during debugging, we need to change the assembly code, extract it into
a separate file and compile the new file into hex values using sp3.
That process also requires us to remove "#if 0" and modify lines starting
with "#", so that sp3 can successfully compile the new file.
With this change, all the above chore is no longer needed, and
cwsr_trap_handler_gfx*.asm can be directly used by sp3 to generate its
hex values.
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
MTYPE_NC_NV (0) marks scalar/vector L1 cache lines as non-volatile.
Cache lines loaded through these apertures are intended to be
invalidated before (and sometimes during) a dispatch. The non-volatile
qualifier prevents these cache lines from being distinguished from
those loaded through the private aperture.
Use MTYPE_NC (1) instead on both Gfx7 and Gfx8. This allows the
compiler to use the BUFFER_WBINVL1_VOL instruction and is a precursor
to automatic per-dispatch scalar/vector L1 volatile invalidation.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Synchronization between context-saving wavefronts is achieved by
sending a SAVEWAVE message to the SPI and then spin-waiting for a
response. These spin-waiting wavefronts may inhibit the progress
of other wavefronts in the context save handler, leading to the
synchronization condition never being achieved.
Before spin-waiting reduce the priority of each wavefront to
guarantee foward progress in the others.
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
args->n_devices is a u32 that comes from the user. The multiplication
could overflow on 32 bit systems possibly leading to privilege
escalation.
Fixes: 5ec7e02854 ("drm/amdkfd: Add ioctls for GPUVM memory management")
Signed-off-by: Dan Carpenter dan.carpenter@oracle.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Pull drm fixes from Dave Airlie:
"One omap, and one alsa pm fix (we merged the breaking patch via drm
tree).
Otherwise it's two bunches of amdgpu fixes, removing an unneeded file,
some DC fixes, HDMI audio regression fix, and some vega12 fixes"
* tag 'drm-fixes-for-v4.17-rc1' of git://people.freedesktop.org/~airlied/linux: (27 commits)
Revert "drm/amd/display: disable CRTCs with NULL FB on their primary plane (V2)"
Revert "drm/amd/display: fix dereferencing possible ERR_PTR()"
drm/amd/display: Fix regamma not affecting full-intensity color values
drm/amd/display: Fix FBC text console corruption
drm/amd/display: Only register backlight device if embedded panel connected
drm/amd/display: fix brightness level after resume from suspend
drm/amd/display: HDMI has no sound after Panel power off/on
drm/amdgpu: add MP1 and THM hw ip base reg offset
drm/amdgpu: fix null pointer panic with direct fw loading on gpu reset
drm/radeon: add PX quirk for Asus K73TK
drm/omap: fix crash if there's no video PLL
drm/amdgpu: Fix memory leaks at amdgpu_init() error path
drm/amdgpu: Fix PCIe lane width calculation
drm/radeon: Fix PCIe lane width calculation
drm/amdgpu/si: implement get/set pcie_lanes asic callback
drm/amdgpu: Add support for SRBM selection v3
Revert "drm/amdgpu: Don't change preferred domian when fallback GTT v5"
drm/amd/powerply: fix power reading on Fiji
drm/amd/powerplay: Enable ACG SS feature
drm/amdgpu/sdma: fix mask in emit_pipeline_sync
...
Hardware understands the regamma LUT as a piecewise linear function,
with points spaced exponentially along the range. We previously
programmed the LUT for range [2^-10, 2^0). This causes (normalized)
color values of 1 (=2^0) to miss the programmed LUT, and fall onto the
end region.
For DCE, the end region is extrapolated using a single (base, slope)
pair, using the max y-value from the last point in the curve as base.
This presents a problem, since this value affects all three color
channels. Scaling down the intensity of say - the blue regamma curve -
will not affect it's end region. This is especially noticiable when
using RedShift. It scales down the blue and green channels, but leaves
full-intensity colors unshifted.
Therefore, extend the range to cover [2^-10, 2^1) by programming another
hardware segment, containing only one point. That way, we won't be
hitting the end region.
Note that things are a bit different for DCN, since the end region can
be set per-channel.
Signed-off-by: Leo (Sunpeng) Li <sunpeng.li@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Instead of user set brightness with range of percentage,
HLK test set brightness level with range of normal, this will result in
HLK test case set brightness from 0 to 255, DC set brightness with ramp is 0,
and disabled ramp change which will fail the HLK test.
Fix:
In case of unblank stream and turn on edp, change brightness level in
stream to 0xFFFFFFFF(actural maximum level is 0xFF), use that value as
a flag to recogonize this the case of resume from S3.
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move DCN1 implementation of stream encoder to new file (instead
of common dce_stream_encoder.c).
Cleanup code related to different implementation due to register
definition differences.
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To prevent future optimization related bugs, just set all update
flags when we have a full update, since we know we want to reprogram
everything in that case.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As per eDP 1.4 spec, there must be at least 500ms delay
between eDP power off and on.
This change added time stamp when edp power off, which can
be used to calculate duration time when edp power on.
If duration less than 500ms, add a wait.
Signed-off-by: Charlene Liu <charlene.liu@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We're an atomic driver and shouldn't access legacy properties. Doing so
will only scare users with stack traces.
Instead save the prop in the state and access it directly. Much simpler.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>