Having completed a test run of gem_eio across all machines in CI we also
observe the phenomenon (of lost interrupts after resetting the GPU) on
gen3 machines as well as the previously sighted gen6/gen7. Let's apply
the same HWSTAM workaround that was effective for gen6+ for all, as
although we haven't seen the same failure on gen4/5 it seems prudent to
keep the code the same.
As a consequence we can remove the extra setting of HWSTAM and apply the
register from a single site.
v2: Delazy and move the HWSTAM into its own function
v3: Mask off all HWSP writes on driver unload and engine cleanup.
v4: And what about the physical hwsp?
v5: No, engine->init_hw() is not called from driver_init_hw(), don't be
daft. Really scrub HWSTAM as early as we can in driver_init_mmio()
v6: Rename set_hwsp as it was setting the mask not the hwsp register.
v7: Ville pointed out that although vcs(bsd) was introduced for g4x/ilk,
per-engine HWSTAM was not introduced until gen6!
References: https://bugs.freedesktop.org/show_bug.cgi?id=108735
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181218102712.11058-1-chris@chris-wilson.co.uk
When the pipe_config's update_pipe flag is set we may need to update the
panel fitting settings. On GEN9+ this means we need to update the crtc's
scaler settings.
This fixes the following WARN_ON, during i915 loading on an Asrock
B150M Pro4S/D3 board with an i5-6500 CPU / graphics:
[drm:pipe_config_err [i915]] *ERROR* mismatch in pch_pfit.enabled
(expected no, found yes)
pipe state doesn't match!
WARNING: CPU: 3 PID: 305 at drivers/gpu/drm/i915/intel_display.c:12084
With line 12084 being the I915_STATE_WARN call inside the
"if (!intel_pipe_config_compare())" block in verify_crtc_state().
On this board with 2 1920x1080 monitors connected over HDMI the GOP
initializes both monitors at 1920x1080 and despite no scaling being
necessary configures a scaler for one of them.
When booting with fastboot=1 on the initial modeset needs_modeset will
be false while update_pipe is true. Since we were not calling
skl_update_scaler_crtc() in this case we would leave the scaler enabled
causing this error.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181217141903.4182-1-hdegoede@redhat.com
DSC can be supported per DP connector. This patch adds a per connector
debugfs node to expose DSC support capability by the kernel.
The same node can be used from userspace to force DSC enable.
force_dsc_en written through this debugfs node is used to force
DSC even for lower resolutions.
Credits to Ville Syrjala for suggesting the proper locks to be used
and to Lyude Paul for explaining how to use them in this context
v8:
* Add else if (ret) for drm_modeset_lock (Lyude)
v7:
* Get crtc, crtc_state from connector atomic state
and add proper locks and backoff (Ville, Chris Wilson, Lyude)
(Suggested-by: Ville Syrjala <ville.syrjala@linux.intel.com>)
* Use %zu for printing size_t variable (Lyude)
v6:
* Read fec_capable only for non edp (Manasi)
v5:
* Name it dsc sink support and also add
fec support in the same node (Ville)
v4:
* Add missed connector_status check (Manasi)
* Create i915_dsc_support node only for Gen >=10 (manasi)
* Access intel_dp->dsc_dpcd only if its not NULL (Manasi)
v3:
* Combine Force_dsc_en with this patch (Ville)
v2:
* Use kstrtobool_from_user to avoid explicit error checking (Lyude)
* Rebase on drm-tip (Manasi)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206005407.4698-1-manasi.d.navare@intel.com
SFC (Scaler & Format Converter) units are shared between VD and VEBoxes.
They also happen to have separate reset bits. So, whenever we want to reset
one or more of the media engines, we have to make sure the SFCs do not
change owner in the process and, if this owner happens to be one of the
engines being reset, we need to reset the SFC as well.
This happens in 4 steps:
1) Tell the engine that a software reset is going to happen. The engine
will then try to force lock the SFC (if currently locked, it will
remain so; if currently unlocked, it will ignore this and all new lock
requests).
2) Poll the ack bit to make sure the hardware has received the forced
lock from the driver. Once this bit is set, it indicates SFC status
(lock or unlock) will not change anymore (until we tell the engine it
is safe to unlock again).
3) Check the usage bit to see if the SFC has ended up being locked to
the engine we want to reset. If this is the case, we have to reset
the SFC as well.
4) Unlock all the SFCs once the reset sequence is completed.
Obviously, if we are resetting the whole GPU, we don't have to worry
about all of this.
BSpec: 10989
BSpec: 10990
BSpec: 10954
BSpec: 10955
BSpec: 10956
BSpec: 19212
Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181213091522.2926-4-chris@chris-wilson.co.uk
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.
The following spatch was used to convert the users of these macros:
@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)
v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
using the bitmask
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
The DDB allocation algorithm currently used by the driver grants each
plane a very small minimum allocation of DDB blocks and then divies up
all of the remaining blocks based on the percentage of the total data
rate that the plane makes up. It turns out that this proportional
allocation approach is overly-generous with the larger planes and can
leave very small planes wthout a big enough allocation to even hit their
level 0 watermark requirements (especially on APL, which has a smaller
DDB in general than other gen9 platforms). Or there can be situations
where the smallest planes hit a lower watermark level than they should
have been able to hit with a more equitable division of DDB blocks, thus
limiting the overall system sleep state that can be achieved.
The bspec now describes an alternate algorithm that can be used to
overcome these types of issues. With the new algorithm, we calculate
all plane watermark values for all wm levels first, then go back and
partition a pipe's DDB space second. The DDB allocation will calculate
what the highest watermark level that can be achieved on *all* active
planes, and then grant the blocks necessary to hit that level to each
plane. Any remaining blocks are then divided up proportionally
according to data rate, similar to the old algorithm.
There was a previous attempt to implement this algorithm a couple years
ago in bb9d85f6e9 ("drm/i915/skl: New ddb allocation algorithm"), but
some regressions were reported, the patch was reverted, and nobody
ever got around to figuring out exactly where the bug was in that
version. Our watermark code has evolved significantly in the meantime,
but we're still getting bug reports caused by the unfair proportional
algorithm, so let's give this another shot.
v2:
- Make sure cursor allocation stays constant and fixed at the end of
the pipe allocation.
- Fix some watermark level iterators that weren't handling the max
level.
v3:
- Ensure we don't leave any DDB blocks unused by using DIV_ROUND_UP+min
to calculate the extra blocks for each plane. (Ville)
- Replace a while() loop with a for() loop to be more consistent with
surrounding code. (Ville)
- Clean unattainable watermark levels with memset rather than directly
clearing the member fields. Also do the same for the transition
watermark values if they can't be achieved. (Ville)
- Drop min_disp_buf_needed calculations in skl_compute_plane_wm() since
the results are no longer needed or used. (Ville)
- Drop skl_latency[0] != 0 sanity check; both watermark methods already
account for an invalid 0 latency by returning FP_16_16_MAX. (Ville)
v4:
- Break DDB allocation loop when total_data_rate=0 rather than
alloc_size=0. If total_data_rate has dropped to 0, all remaining
planes are disabled, which isn't true for alloc_size (we might just
have not had any remaining blocks to hand out). Plus
total_data_rate=0 is the case we need to avoid to a prevent a
div-by-0. (Ville)
- s/DIV_ROUND_UP/DIV64_U64_ROUND_UP/ to prevent 32-bit breakage (Ville)
v5:
- Don't forget to move 'start' pointer forward for UV surface when
setting plane DDB boundaries. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105458
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-2-matthew.d.roper@intel.com
The bspec gives an if/else chain for choosing whether to use "method 1"
or "method 2" for calculating the watermark "Selected Result Blocks"
value for a plane. One of the branches of the if chain is:
"Else If ('plane buffer allocation' is known and (plane buffer
allocation / plane blocks per line) >=1)"
Since our driver currently calculates DDB allocations first and the
actual watermark values second, the plane buffer allocation is known at
this point in our code and we include this test in our driver's logic.
However we plan to soon move to a "watermarks first, ddb allocation
second" sequence where we won't know the DDB allocation at this point.
Let's drop this arm of the if/else statement (effectively considering
the DDB allocation unknown) as an independent patch so that any
regressions can be more accurately bisected to either the different
watermark value (in this patch) or the new DDB allocation (in the next
patch).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-1-matthew.d.roper@intel.com
Try to be more consistent about intel_* types rather than drm_* types
for lower-level driver functions. While we're at it, let's also be more
consistent with state variable naming (half of the platforms use the
name 'state' whereas the other half used 'crtc_state').
While we're touching these variables, let's also be more consistent
about always naming the intel_crtc_state's "crtc_state" rather than
"state" so that different platform types aren't using different naming
conventions.
v2:
- s/state/crtc_state/ for consistency between platform types (Ville)
- Drop the crtc parameter to intel_color_check(); we can just pull that
out of the state object.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181210215415.19854-2-matthew.d.roper@intel.com
The parameter 'void *gvt' is not used and required for hypervisor's
exit call. Even for non-merged Xen hypervisor support. So just remove it.
Reviewed-by: Yuan, Hang <hang.yuan@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Don't mark hypervisor module's host_init as optional,
but mandatory required.
Reviewed-by: Yuan, Hang <hang.yuan@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two but three extra store-dw. Doing so
for each GPU relocation was a little unsightly and since we need to
worry about userspace hitting the same issues, we should apply the dummy
store-dw into the EMIT_FLUSH.
Fixes: 7dd4f6729f ("drm/i915: Async GPU relocation processing")
References: 7fa28e1469 ("drm/i915: Write GPU relocs harder with gen3")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207134037.11848-1-chris@chris-wilson.co.uk
Although commit fb6f0b64e4 ("drm/i915: Prevent machine hang from
Broxton's vtd w/a and error capture") applied cleanly after a 24 month
hiatus, the code had moved on with new methods for peeking and fetching
the captured gpu info. Make sure we catch all uses of the stashed error
state and avoid dereferencing the error pointer.
v2: Move error pointer determination into i915_gpu_capture_state
v3: Restore early check to avoid capturing and then throwing away
subsequent GPU error states.
Fixes: fb6f0b64e4 ("drm/i915: Prevent machine hang from Broxton's vtd w/a and error capture")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207110554.19897-1-chris@chris-wilson.co.uk
Currently we face a severe problem on Braswell that manifests as invalid
ppGTT accesses. The code tries to maintain the PDP (page directory
pointers) inside the context in two ways, direct write into the context
and a pipelined LRI update. The direct write into the context is
fundamentally racy as it is unserialised with any access (read or write)
the GPU is doing. By asserting that Braswell is not used with vGPU
(currently an unsupported platform) we can eliminate the dangerous
direct write into the context image and solely use the pipelined update.
However, the LRI of the PDP fouls up the GPU, causing it to freeze and
take out the machine with "forcewake ack timeouts". This seems possible
to workaround by preventing the GPU from sleeping (via means of
disabling the power-state management interface, i.e. forcing each ring
to remain awake) around the update. Equally, it seems an EMIT_INVALIDATE
before the LRI is sufficient to prevent the forcewake errors.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207090213.14352-3-chris@chris-wilson.co.uk
For a lot of use cases we need 64bit sequence numbers. Currently drivers
overload the dma_fence structure to store the additional bits.
Stop doing that and make the sequence number in the dma_fence always
64bit.
For compatibility with hardware which can do only 32bit sequences the
comparisons in __dma_fence_is_later only takes the lower 32bits as significant
when the upper 32bits are all zero.
v2: change the logic in __dma_fence_is_later
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Link: https://patchwork.freedesktop.org/patch/266927/
Recently gvt shadow ctx create ppgtt table and this ppgtt's root
pointer is modified at workload dispatch, then we lose the original
ppgtt's root pointer, this causes the ppgtt destroy function abnormal
as it will release the wrong root table.
This patch save i915 context ppgtt root pointer at shadow
ctx creation and restore it at shadow ctx destruction.
v2: Split save and restore function (Zhenyu)
Fixes:4f15665ccbba("drm/i915: Add ppgtt to GVT GEM context")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Host print below warning message when creating guest:
"gvt: vgpu(2) Invalid FORCE_NONPRIV write 83a8".
Register 0x83a8 should be in force-to-nonpriv whitelist as required by
guest
v2: update commit message to describe purpose of this patch in detail
(zhenyu wang)
Signed-off-by: Zhao Yan <yan.y.zhao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Final changes to drm-misc-next for v4.21:
UAPI Changes:
Core Changes:
- Add dma_fence_get_stub to dma-buf, and use it in drm/syncobj.
- Add and use DRM_MODESET_LOCK_BEGIN/END helpers.
- Small fixes to drm_atomic_helper_resume(), drm_mode_setcrtc() and
drm_atomic_helper_commit_duplicated_state()
- Fix drm_atomic_state_helper.[c] extraction.
Driver Changes:
- Small fixes to tinydrm, vkms, meson, rcar-du, virtio, vkms,
v3d, and pl111.
- vc4: Allow scaling and YUV formats on cursor planes.
- v3d: Enable use of the Texture Formatting Unit, and fix
prime imports of buffers from other drivers.
- Add support for the AUO G101EVN010 panel.
- sun4i: Enable support for the H6 display engine.
Signed-off-by: Dave Airlie <airlied@redhat.com>
[airlied: added drm/v3d: fix broken build to the merge commit]
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/321be9d3-ab75-5f92-8193-e5113662edef@linux.intel.com
We can move the remaining RCS workarounds applied to only gen8 to the
engine->wa_list, and then reduce all engine->init_hw callbacks to common
code. The benefit of using the new wa_list is that we verify that the
registers are indeed restored and keep their magic values.
v2: INSTPM_FORCE_ORDERING is already part of gen8_ctx_workarounds, and
as confirmed by the mmio verification is a part of the context image!
v3: MI_MODE is already part of gen8_ctx_workarounds...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206180713.6827-2-chris@chris-wilson.co.uk
At enable/disable of the HDCP encryption, for encryption status change
we need minimum one frame duration. And we might program this bit any
point(start/End) in the previous frame.
With 20mSec, observed the timeout for change in encryption status.
Since this is not time critical operation and we need to hold on
until the status is changed, fixing the timeout to 50mSec. (Based on
trial and error method!)
v2:
%s/TIME_FOR_ENCRYPT_STATUS_CHANGE/ENCRYPT_STATUS_CHANGE_TIMEOUT_MS
[Sean Paul]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1544010283-20223-5-git-send-email-ramalingam.c@intel.com