If we are asked to submit a completed request, just move it onto the
active-list without modifying it's payload. If we try to emit the
modified payload of a completed request, we risk racing with the
ring->head update during retirement which may advance the head past our
breadcrumb and so we generate a warning for the emission being behind
the RING_HEAD.
v2: Commentary for the sneaky, shared responsibility between functions.
v3: Spelling mistakes and bonus assertion
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190923110056.15176-3-chris@chris-wilson.co.uk
Gamma lut programming can be programmed using DSB
where bulk register programming can be done using indexed
register write which takes number of data and the mmio offset
to be written.
Currently enabled for 12-bit gamma LUT which is enabled by
default and later 8-bit/10-bit will be enabled in future
based on need.
v1: Initial version.
v2: Directly call dsb-api at callsites. (Jani)
v3:
- modified the code as per single dsb instance per crtc. (Shashank)
- Added dsb get/put call in platform specific load_lut hook. (Jani)
- removed dsb pointer from dev_priv. (Jani)
v4: simplified code by dropping ref-count implementation. (Shashank)
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-9-animesh.manna@intel.com
This patch adds a function, which will internally get the gem buffer
for DSB engine. The GEM buffer is from global GTT, and is mapped into
CPU domain, contains the data + opcode to be feed to DSB engine.
v1: Initial version.
v2:
- removed some unwanted code. (Chris)
- Used i915_gem_object_create_internal instead of _shmem. (Chris)
- cmd_buf_tail removed and can be derived through vma object. (Chris)
v3: vma realeased if i915_gem_object_pin_map() failed. (Shashank)
v4: for simplification and based on current usage added single dsb
object in intel_crtc. (Shashank)
v5: seting NULL to cmd_buf moved outside of mutex in dsb-put(). (Shashank)
v6:
- refcount machanism added.
- Used atomic_add_return and atomic_dec_and_test instead of
atomic_inc and atomic_dec. (Jani)
Cc: Imre Deak <imre.deak@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
[Jani: added #include <linux/types.h> while pushing]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190920115930.27829-3-animesh.manna@intel.com
For icl+, have hw read out to create hw blob of gamma
lut values. icl+ platforms supports multi segmented gamma
mode by default, add hw lut creation for this mode.
This will be used to validate gamma programming using dsb
(display state buffer) which is a tgl specific feature.
Major change done-removal of readouts of coarse and fine segments
because PAL_PREC_DATA register isn't giving propoer values.
State checker limited only to "fine segment"
v2: -readout code for multisegmented gamma has to come
up with some intermediate entries that aren't preserved
in hardware (Jani N)
-linear interpolation (Ville)
-moved common code to check gamma_enable to specific funcs,
since icl doesn't support that
v3: -use u16 instead of __u16 [Jani N]
-used single lut [Jani N]
-improved and more readable for loops [Jani N]
-read values directly to actual locations and then fill gaps [Jani N]
-moved cleaning to patch 1 [Jani N]
-renamed icl_read_lut_multi_seg() to icl_read_lut_multi_segment to
make it similar to icl_load_luts()
-renamed icl_compute_interpolated_gamma_blob() to
icl_compute_interpolated_gamma_lut_values() more sensible, I guess
v4: -removed interpolated func for creating gamma lut values
-removed readouts of fine and coarse segments, failure to read PAL_PREC_DATA
correctly
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1569096654-24433-3-git-send-email-swati2.sharma@intel.com
On ILK-IVB the pipe colorspace is configured via PIPECONF
(as opposed to PIPEMISC in BDW+). Let's configure+readout
that stuff correctly.
Enabling YCbCr 4:4:4 output will now be a simple matter of
setting crtc_state->output_format appropriately in the encoder
.compute_config(). However, when we do that we must be
aware of the fact that YCbCr DP output doesn't seem to work
on ILK (resulting image is totally garbled), but on SNB+
it works fine. However HDMI YCbCr output does work correctly
even on ILK.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-13-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Prepare the pipe csc for YCbCr output on ilk/snb. The main difference
to IVB+ is the lack of explicit post offsets, and instead we must
configure the CSC info RGB->YUV mode (which takes care of offsetting
Cb/Cr properly) and enable the "black screen offset" bit to add the
required offset to Y.
And while at it throw some comments around the bit defines to
document which platforms have which bits.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-12-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Since HSW the PIPECONF progressive vs. interlaced selection is done
with just two bits instead of the earlier three. Let's not look at the
extra bit on HSW+. Also gen2 doesn't support interlaced displays at all.
This is actually fine as is currently because the extra bit is mbz (as
are all three bits on gen2). But just to avoid mishaps in the future
if the bits get reused let's only look at what's properly defined.
v2: constify crtc_state
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-8-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
crtc_state->limited_color_range only applies to RGB output but
we're currently setting it even for YCbCr output. That will
lead to conflicting MSA and PIPECONF settings which can mess
up the image. Let's make sure limited_color_range stays unset
with YCbCr output.
Also WARN if we end up with such a bogus combination when
programming the MSA MISC bits as it's impossible to even
indicate quantization rangle for YCbCr via MSA MISC. YCbCr
output is simply assumed to be limited range always. Note
that VSC SDP does provide a mechanism for full range YCbCr,
so in the future we may want to rethink how we compute/store
this state.
And for good measure we add the same WARN to the HDMI path.
v2: s/==/!=/ in the HDMI WARN
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718164523.11738-1-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Make both GuC and HuC to use "." as the separator. Hardcode
the separator in MAKE_UC_FW_PATH. Remove the usage of "ver" from HuC.
The current convention being:
<platform>_<g/h>uc_<major>.<minor>.patch.bin
Update the versions of HuC being loaded of the platforms.
SKL - v2.0.0
BXT - v2.0.0
KBL - v4.0.0
GLK - v4.0.0
CFL - KBL v4.0.0
ICL - v9.0.0
CML - v4.0.0
v2: Remove the separator parameter altogether from
__MAKE_UC_FW_PATH.(Daniele)
- Squash all firmware update patches (Daniele)
v3: s/huc/HuC
- Correct the order of platforms
- Change REVID of cml to 5(Michal)
- Code space changes in huc_def (Daniele)
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919201204.9691-1-daniele.ceraolospurio@intel.com
As not only is the signal->timeline volatile, so will be acquiring the
timeline's HWSP. We must first carefully acquire the timeline from the
signaling request and then lock the timeline. With the removal of the
struct_mutex serialisation of request construction, we can have multiple
timelines active at once, and so we must avoid using the nested mutex
lock as it is quite possible for both timelines to be establishing
semaphores on the other and so deadlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-3-chris@chris-wilson.co.uk
As we need to take a walk back along the signaler timeline to find the
fence before upon which we want to wait, we need to lock that timeline
to prevent it being modified as we walk. Similarly, we also need to
acquire a reference to the earlier fence while it still exists!
Though we lack the correct locking today, we are saved by the
overarching struct_mutex -- but that protection is being removed.
v2: Tvrtko made me realise I was being lax and using annotations to
ignore the AB-BA deadlock from the timeline overlap. As it would be
possible to construct a second request that was using a semaphore from the
same timeline as ourselves, we could quite easily end up in a situation
where we deadlocked in our mutex waits. Avoid that by using a trylock
and falling back to a normal dma-fence await if contended.
v3: Eek, the signal->timeline is volatile and must be carefully
dereferenced to ensure it is valid.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-2-chris@chris-wilson.co.uk
The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.
One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.
v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
Before we execute a batch, we must first issue any and all TLB
invalidations so that batch picks up the new page table entries.
Tigerlake's preparser is weakening our post-sync CS_STALL inside the
invalidate pipe-control and allowing the loading of the batch buffer
before we have setup its page table (and so it loads the wrong page and
executes indefinitely).
The igt_cs_tlb indicates that this issue can only be observed on rcs,
even though the preparser is common to all engines. Alternatively, we
could do TLB shootdown via mmio on updating the GTT.
By inserting the pre-parser disable inside EMIT_INVALIDATE, we will also
accidentally fixup execution that writes into subsequent batches, such
as gem_exec_whisper and even relocations performed on the GPU. We should
be careful not to allow this disable to become baked into the uABI! The
issue is that if userspace relies on our disabling of the HW
optimisation, when we are ready to enable that optimisation, userspace
will then be broken...
Testcase: igt/i915_selftests/live_gtt/igt_cs_tlb
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111753
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919151811.9526-1-chris@chris-wilson.co.uk
Modern platforms allow the transcoders hdisplay/vdisplay to exceed the
planes' max resolution. This has the nasty implication that modes on the
connectors' mode list may not be usable when the user asks for a
fullscreen plane. Seeing as that is the most common use case it seems
prudent to filter out modes that don't allow for fullscreen planes to
be enabled.
Let's do that in the connetor .mode_valid() hook so that normally
such modes are kept hidden but the user is still able to forcibly
specify such a mode if they know they don't need fullscreen planes.
This is in line with ealier policies regarding certain clock limits.
The idea is to prevent the casual user from encountering a mode that
would fail under typical conditions, but allow the expert user to
force things if they so wish.
Maybe in the future we should consider automagically using two
planes when one can't cover the entire screen? Wouldn't be a
great match for the current uapi with explicit planes though,
but I guess no worse than using two pipes (which we apparently
have to in the future anyway). Either that or we'd have to
teach userspace to do it for us.
v2: Fix icl+ max plane heigth (Manasi)
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Leho Kraav <leho@kraav.com>
Cc: Sean Paul <sean@poorly.run>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190918150707.32420-1-ville.syrjala@linux.intel.com
The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).
According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)
So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.
In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.
Cc: stable@vger.kernel.org
Cc: Sean Paul <sean@poorly.run>
Cc: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Leho Kraav <leho@kraav.com>
Fixes: 372b9ffb57 ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905135044.2001-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Sean Paul <sean@poorly.run>
We generally assume future platforms will inherit the behavior of the
most recent platforms, so update our DDC pin mapping defaults to match
how ICP/TGP behave (i.e., pins starting from GMBUS_PIN_1_BXT for combo
PHY's and pins starting from GMBUS_PIN_9_TC1_ICP for TC PHY's). MCC's
non-standard handling of combo PHY C seems like a platform-specific
quirk that is unlikely to be duplicated on future platforms, so continue
handling it as a special case.
Without this change, future platforms would default to gen4-style pin
mapping which is almost certainly not what we'll want.
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190918235626.3750-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>