According to Spec for SKL+: "Isochronous Priority Control.
If enabled, Display sends demoted requests once the transition
watermark is reached. If transition watermark is not enabled,
Display sends demoted requests when the display buffer is full."
The commit 'e57f1c02155f ("drm/i915/gen9+: Add has_ipc flag in
device info structure")' introduced that as gen9+ but missing many
SKL Skus.
I believe the reason for that is Spec also mentions workarounds for
SKL-ALL: "IPC (Isoch Priority Control) may cause underflows
WA: Do not enable IPC in register ARB_CTL2"
It seems lame to add the feature and forever disable it,
but it will avoid a mistake of enabling it when we are reorganizing
the feature definitions on i915_pci.c later.
It will also allow us to probably extend that workaround for
other platforms.
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003063652.17248-1-rodrigo.vivi@intel.com
This patch adds IPC support. This patch also enables IPC in all supported
platforms based on has_ipc flag.
IPC (Isochronous Priority Control) is the hardware feature, which
dynamically controls the memory read priority of Display.
When IPC is enabled, plane read requests are sent at high priority until
filling above the transition watermark, then the requests are sent at
lower priority until dropping below the level 0 watermark.
The lower priority requests allow other memory clients to have better
memory access. When IPC is disabled, all plane read requests are sent at
high priority.
Changes since V1:
- Remove commandline parameter to disable ipc
- Address Paulo's comments
Changes since V2:
- Address review comments
- Set ipc_enabled flag
Changes since V3:
- move ipc_enabled flag assignment inside intel_ipc_enable function
Changes since V4:
- Re-enable IPC after suspend/resume
Changes since V5:
- Enable IPC for all gen >=9 except SKL
Changes since V6:
- fix commit msg
- after resume program IPC based on SW state.
Changes since V7:
- Modify IPC support check based on HAS_IPC macro (suggested by Chris)
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817134529.2839-8-mahesh1.kumar@intel.com
GEN > 9 require transition WM to be programmed if IPC is enabled.
This patch calculates & enable transition WM for supported platforms.
If transition WM is enabled, Plane read requests are sent at high
priority until filling above the transition watermark, then the
requests are sent at lower priority until dropping below the level-0 WM.
The lower priority requests allow other memory clients to have better
memory access.
transition minimum is the minimum amount needed for trans_wm to work to
ensure the demote does not happen before enough data has been read to
meet the level 0 watermark requirements.
transition amount is configurable value. Higher values will
tend to cause longer periods of high priority reads followed by longer
periods of lower priority reads. Tuning to lower values will tend to
cause shorter periods of high and lower priority reads.
Keeping transition amount to 10 in this patch, as suggested by HW team.
Changes since V1:
- Address review comments from Maarten
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817134529.2839-4-mahesh1.kumar@intel.com
Skip compressing 1 segment at the end of the frame,
avoid a pixel count mismatch nuke event when last active
pixel and dummy pixel has same color for Odd Plane
Width / Height.
For both platforms Gemini Lake and Cannon Lake.
v2: Use function-like macro and also use mask to clean
to make sure bit 11 is 0. (Suggested by Paulo).
v3: Add Display WA notation and also apply for GLK.
Both Forgotten on v2.
Using "GLK_" prefix since GLK came before CNL.
v4: Forgot to "|=" when moving directly macro to masked
val. (Noticed by Paulo.)
v5: Rebased on top of 0a46ddd57c ("drm/i915/cnp: Wa 1181:
Fix Backlight issue")
Cc: Imre Deak <imre.deak@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170905193013.31710-1-rodrigo.vivi@intel.com
If we miss the current vblank because the gpu was busy, that may cause a
jitter as the frame rate temporarily drops. We try to limit the impact
of this by then boosting the GPU clock to deliver the frame as quickly
as possible. Originally done in commit 6ad790c0f5 ("drm/i915: Boost GPU
frequency if we detect outstanding pageflips") but was never forward
ported to atomic and finally dropped in commit fd3a40242e ("drm/i915:
Rip out legacy page_flip completion/irq handling").
One of the most typical use-cases for this is a mostly idle desktop.
Rendering one frame of the desktop's frontbuffer can easily be
accomplished by the GPU running at low frequency, but often exceeds
the time budget of the desktop compositor. The result is that animations
such as opening the menu, doing a fullscreen switch, or even just trying
to move a window around are slow and jerky. We need to respond within a
frame to give the best impression of a smooth UX, as a compromise we
instead respond if that first frame misses its goal. The result should
be a near-imperceivable initial delay and a smooth animation even
starting from idle. The cost, as ever, is that we spend more power than
is strictly necessary as we overestimate the required GPU frequency and
then try to ramp down.
This of course is reactionary, too little, too late; nevertheless it is
surprisingly effective.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102199
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817123706.6777-1-chris@chris-wilson.co.uk
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
This bit enables hardware that will change the approximation used for distances
calculations for AA wide lines so that they are rendered more accurately.
The default value for this bit leaves the legacy behavior. There is no good
reason to not enable the new approximation except if comparing to previous GEN
rendered images.
v2: Rebase
v3: Fix author.
Rebased by Rodrigo who also added a comment as suggested by Oscar.
Since it is surrounded by Workarounds let's just add a comment to
make clear it is not an Wa.
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-4-rodrigo.vivi@intel.com
Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.
v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
since they don't carry any gen10 related W/a. (by Oscar).
Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
many workarounds were changed to be A0 only so let's remove
them.
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com
SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involved telling the display engine
the location of the color control surfae (CCS) which describes
which parts of the main surface are compressed and which are not. The
location of CCS is provided by userspace as just another plane with its
own offset.
Add the required stuff to validate the user provided AUX plane metadata
and convert the user provided linear offset into something the hardware
can consume.
Due to hardware limitations we require that the main surface and
the AUX surface (CCS) be part of the same bo. The hardware also
makes life hard by not allowing you to provide separate x/y offsets
for the main and AUX surfaces (excpet with NV12), so finding suitable
offsets for both requires a bit of work. Assuming we still want keep
playing tricks with the offsets. I've just gone with a dumb "search
backward for suitable offsets" approach, which is far from optimal,
but it works.
Also not all planes will be capable of scanning out compressed surfaces,
and eg. 90/270 degree rotation is not supported in combination with
decompression either.
This patch may contain work from at least the following people:
* Vandana Kannan <vandana.kannan@intel.com>
* Daniel Vetter <daniel@ffwll.ch>
* Ben Widawsky <ben@bwidawsk.net>
v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
v3: Pretend CCS tiles are regular 128 byte wide Y tiles (Jason)
Put the AUX register defines to the correct place
Fix up the slightly bogus rotation check
v4: Use I915_WRITE_FW() due to plane update locking changes
s/return -EINVAL/goto err/ in intel_framebuffer_init()
Eliminate a bunch hardcoded numbers in CCS code
v5: (By Ben)
conflict resolution +
- res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+ res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
v6: (daniels) Fix botched commit message.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ville Syrjä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Stone <daniels@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170801165817.7063-1-ben@bwidawsk.net
This patch make naming of fixed-point wrappers consistent
operation_<any_post_operation>_<1st operand>_<2nd operand>
also shorten the name for fixed_16_16 to fixed16
s/u32_to_fixed_16_16/u32_to_fixed16
s/fixed_16_16_to_u32/fixed16_to_u32
s/fixed_16_16_to_u32_round_up/fixed16_to_u32_round_up
s/min_fixed_16_16/min_fixed16
s/max_fixed_16_16/max_fixed16
s/mul_u32_fixed_16_16/mul_u32_fixed16
s/fixed_16_16_div/div_fixed16
Changes Since V1:
- Split the patch in more logical patches (Maarten)
Changes Since V2:
- Rebase
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-4-mahesh1.kumar@intel.com
Once a client has requested a waitboost, we keep that waitboost active
until all clients are no longer waiting. This is because we don't
distinguish which waiter deserves the boost. However, with the advent of
fence signaling, the signaler threads appear as waiters to the RPS
interrupt handler. So instead of using a single boolean to track when to
keep the waitboost active, use a counter of all outstanding waitboosted
requests.
At this point, I have removed all vestiges of the rate limiting on
clients. Whilst this means that compositors should remain more fluid,
it also means that boosts are more prevalent. See commit b29c19b645
("drm/i915: Boost RPS frequency for CPU stalls") for a longer discussion
on the pros and cons of both approaches.
A drawback of this implementation is that it requires constant request
submission to keep the waitboost trimmed (as it is now cancelled when the
request is completed). This will be fine for a busy system, but near
idle the boosts may be kept for longer than desired (effectively tens of
vblanks worstcase) and there is a reliance on rc6 instead.
v2: Remove defunct rps.client_lock
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170628123548.9236-1-chris@chris-wilson.co.uk
A display resolution is only supported if it meets all the restrictions
below for Maximum Pipe Pixel Rate.
The display resolution must fit within the maximum pixel rate output
from the pipe. Make sure that the display pipe is able to feed pixels at
a rate required to support the desired resolution.
For each enabled plane on the pipe {
If plane scaling enabled {
Horizontal down scale amount = Maximum[1, plane horizontal size /
scaler horizontal window size]
Vertical down scale amount = Maximum[1, plane vertical size /
scaler vertical window size]
Plane down scale amount = Horizontal down scale amount *
Vertical down scale amount
Plane Ratio = 1 / Plane down scale amount
}
Else {
Plane Ratio = 1
}
If plane source pixel format is 64 bits per pixel {
Plane Ratio = Plane Ratio * 8/9
}
}
Pipe Ratio = Minimum Plane Ratio of all enabled planes on the pipe
If pipe scaling is enabled {
Horizontal down scale amount = Maximum[1, pipe horizontal source size /
scaler horizontal window size]
Vertical down scale amount = Maximum[1, pipe vertical source size /
scaler vertical window size]
Note: The progressive fetch - interlace display mode is equivalent to a
2.0 vertical down scale
Pipe down scale amount = Horizontal down scale amount *
Vertical down scale amount
Pipe Ratio = Pipe Ratio / Pipe down scale amount
}
Pipe maximum pixel rate = CDCLK frequency * Pipe Ratio
In this patch our calculation is based on pipe downscale amount
(plane max downscale amount * pipe downscale amount) instead of Pipe
Ratio. So,
max supported crtc clock with given scaling = CDCLK / pipe downscale.
Flip will fail if,
current crtc clock > max supported crct clock with given scaling.
Changes since V1:
- separate out fixed_16_16 wrapper API definition
Changes since V2:
- Fix buggy crtc !active condition (Maarten)
- use intel_wm_plane_visible wrapper as per Maarten's suggestion
Changes since V3:
- Change failure return from ERANGE to EINVAL
Changes since V4:
- Rebase based on previous patch changes
Changes since V5:
- return EINVAL instead of continue (Maarten)
Changes since V6:
- Improve commit message
- Address review comment
Changes since V7:
- use !enable instead of !active
- rename config variable for consistency (Maarten)
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170526151546.25025-4-mahesh1.kumar@intel.com
This patch implements new DDB allocation algorithm as per HW team
recommendation. This algo takecare of scenario where we allocate less DDB
for the planes with lower relative pixel rate, but they require more DDB
to work.
It also takes care of enabling same watermark level for each
plane in crtc, for efficient power saving.
Changes since v1:
- Rebase on top of Paulo's patch series
Changes since v2:
- Fix the for loop condition to enable WM
Changes since v3:
- Fix crash in cursor i-g-t reported by Maarten
- Rebase after addressing Paulo's comments
- Few other ULT fixes
Changes since v4:
- Rebase on drm-tip
- Added separate function to enable WM levels
Changes since v5:
- Fix a crash identified in skl-6770HQ system
Changes since v6:
- Address review comments from Matt
Changes since v7:
- Fix failure return in skl_compute_plane_wm (Matt)
- fix typo
Changes since v8:
- Always check cursor wm enable irrespective of total_data_rate
Changes since v9:
- fix typo
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170601055918.4601-1-mahesh1.kumar@intel.com
This patch make changes to use linetime latency if allocated
DDB size during plane watermark calculation is not available.
linetime is the time, display engine takes to fetch one line worth of
pixels with given pixel clock rate.
This is required to implement new DDB allocation algorithm.
In New Algorithm DDB is allocated based on WM values, because of which
number of DDB blocks will not be available during WM calculation,
So this "linetime latency" is suggested by SV/HW team to be used during
switch-case for WM blocks selection.
linetime latency us = pipe horizontal total pixels/adjusted pixel rate MHz
Changes since v1:
- Rebase on top of Paulo's patch series
Changes since v2:
- Fix if-else condition (pointed by Maarten)
Changes since v3:
- Use common function for timetime_us calculation (Paulo)
- rebase on drm-tip
Changes since v4:
- Use consistent name for fixed_point operation
Changes since v5:
- Improve commit message
- rename skl_get_linetime_us to intel_get_linetime_us
- fix watermark result selection (Matt)
Signed-off-by: "Mahesh Kumar" <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517115831.13830-11-mahesh1.kumar@intel.com