As time goes by, usage of generic ioctls such as drm_syncobj and
sync_file are on the increase bypassing i915-specific ioctls like
GEM_WAIT. Currently, we only apply waitboosting to our driver ioctls as
we track the file/client and account the waitboosting to them. However,
since commit 7b92c1bd05 ("drm/i915: Avoid keeping waitboost active for
signaling threads"), we no longer have been applying the client
ratelimiting on waitboosts and so that information has only been used
for debug tracking.
Push the application of waitboosting down to the common
i915_request_wait, and apply it to all foreign fence waits as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190213092504.25709-1-chris@chris-wilson.co.uk
Currently we're only dumping out the ddb allocation changes, let's do
the same for the watermarks. This should help with debugging underruns
and whatnot.
First I tried one line per plane per wm level, but that resulted in
an obnoxious amount of lines printed. So as a compromise I settled
on a four line format, each line containing a single watermark related
value (enable,lines,blocks,min_ddb_alloc) for all 8 levels (+trans wm).
It still produces quite a lot of output but I can't really see a way
around that because we simply have a lot of data to dump.
Let's also pimp the ddb debug to print the size of the allocations
too, not just their bounds. Makes it a bit easier to compare against
the watermarks.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190208200527.12844-1-ville.syrjala@linux.intel.com
Reviewed-by: Clint Taylor <Clinton.A.Taylor@intel.com>
When adding the early latency==0 check back I neglected to
realize that we no longer have a way to return a failure
from the wm computation like we had in the past (since we
now calculate wms before ddb allocations). Also plane_en
being false doesn't actually indicate that the level is
invalid as it wil also happen when the plane is not
enabled.
skl_allocate_pipe_ddb() starts scanning from the maximum
watermark level and it stops as soon as it finds a level
that is deemed viable. The assumption being that if level
n+1 is valid then level n is valid as well. Thus if we
now disable any watermark level by zeroing its latency
the code will think that level to be actually valid
and won't confirm whether the actually enabled lower
watermark level(s) actually fit into the allotted ddb
space. This results in hilarious watermark values that
exceed the ddb allocation of the plane.
The way we must now indicate a failure is to assign an
unreasoanbly big value to min_ddb_alloc which will then
make skl_allocate_pipe_ddb() reject the entire level.
v2: Also do the same for the lines>31 case (Matt)
v3: Make 'blocks' u32 (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190205155053.10081-1-ville.syrjala@linux.intel.com
The spec doesn't use a definite article in front of SAGV. The
rules regarding articles and initialisms are super fuzzy, but
at least to my ears it sounds much more natural to not have
the article. Perhaps because I tend to pronounce it as
"sag-vee" instead of spelling out the letters one at a time.
Actually I might still prefer to leave out the article if I
did spell them out.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-8-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On icl+ bspec tells us to calculate a separate minimum ddb
allocation from the blocks watermark. Both have to be checked
against the actual ddb allocation, but since we do things the
other way around we'll just calculat the minimum acceptable
ddb allocation by taking the maximum of the two values.
We'll also replace the memcmp() with a full trawl over the
the watermarks so that it'll ignore the min_ddb_alloc
because we can't directly read that out from the hw. I suppose
we could reconstruct it from the other values, but I was
too lazy to do that now.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181221171436.8218-6-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Currently Ironlake operates under the assumption that rpm awake (and its
error checking is disabled). As such, we have missed a few places where we
access registers without taking the rpm wakeref and thus trigger
warnings. intel_ips being one culprit.
As this involved adding a potentially sleeping rpm_get, we have to
rearrange the spinlocks slightly and so switch to acquiring a device-ref
under the spinlock rather than hold the spinlock for the whole
operation. To be consistent, we make the change in pattern common to the
intel_ips interface even though this adds a few more atomic operations
than necessary in a few cases.
v2: Sagar noted the mb around setting mch_dev were overkill as we only
need ordering there, and that i915_emon_status was still using
struct_mutex for no reason, but lacked rpm.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-21-chris@chris-wilson.co.uk
The majority of runtime-pm operations are bounded and scoped within a
function; these are easy to verify that the wakeref are handled
correctly. We can employ the compiler to help us, and reduce the number
of wakerefs tracked when debugging, by passing around cookies provided
by the various rpm_get functions to their rpm_put counterpart. This
makes the pairing explicit, and given the required wakeref cookie the
compiler can verify that we pass an initialised value to the rpm_put
(quite handy for double checking error paths).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190114142129.24398-16-chris@chris-wilson.co.uk
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.
The following spatch was used to convert the users of these macros:
@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)
v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
using the bitmask
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
The DDB allocation algorithm currently used by the driver grants each
plane a very small minimum allocation of DDB blocks and then divies up
all of the remaining blocks based on the percentage of the total data
rate that the plane makes up. It turns out that this proportional
allocation approach is overly-generous with the larger planes and can
leave very small planes wthout a big enough allocation to even hit their
level 0 watermark requirements (especially on APL, which has a smaller
DDB in general than other gen9 platforms). Or there can be situations
where the smallest planes hit a lower watermark level than they should
have been able to hit with a more equitable division of DDB blocks, thus
limiting the overall system sleep state that can be achieved.
The bspec now describes an alternate algorithm that can be used to
overcome these types of issues. With the new algorithm, we calculate
all plane watermark values for all wm levels first, then go back and
partition a pipe's DDB space second. The DDB allocation will calculate
what the highest watermark level that can be achieved on *all* active
planes, and then grant the blocks necessary to hit that level to each
plane. Any remaining blocks are then divided up proportionally
according to data rate, similar to the old algorithm.
There was a previous attempt to implement this algorithm a couple years
ago in bb9d85f6e9 ("drm/i915/skl: New ddb allocation algorithm"), but
some regressions were reported, the patch was reverted, and nobody
ever got around to figuring out exactly where the bug was in that
version. Our watermark code has evolved significantly in the meantime,
but we're still getting bug reports caused by the unfair proportional
algorithm, so let's give this another shot.
v2:
- Make sure cursor allocation stays constant and fixed at the end of
the pipe allocation.
- Fix some watermark level iterators that weren't handling the max
level.
v3:
- Ensure we don't leave any DDB blocks unused by using DIV_ROUND_UP+min
to calculate the extra blocks for each plane. (Ville)
- Replace a while() loop with a for() loop to be more consistent with
surrounding code. (Ville)
- Clean unattainable watermark levels with memset rather than directly
clearing the member fields. Also do the same for the transition
watermark values if they can't be achieved. (Ville)
- Drop min_disp_buf_needed calculations in skl_compute_plane_wm() since
the results are no longer needed or used. (Ville)
- Drop skl_latency[0] != 0 sanity check; both watermark methods already
account for an invalid 0 latency by returning FP_16_16_MAX. (Ville)
v4:
- Break DDB allocation loop when total_data_rate=0 rather than
alloc_size=0. If total_data_rate has dropped to 0, all remaining
planes are disabled, which isn't true for alloc_size (we might just
have not had any remaining blocks to hand out). Plus
total_data_rate=0 is the case we need to avoid to a prevent a
div-by-0. (Ville)
- s/DIV_ROUND_UP/DIV64_U64_ROUND_UP/ to prevent 32-bit breakage (Ville)
v5:
- Don't forget to move 'start' pointer forward for UV surface when
setting plane DDB boundaries. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105458
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-2-matthew.d.roper@intel.com
The bspec gives an if/else chain for choosing whether to use "method 1"
or "method 2" for calculating the watermark "Selected Result Blocks"
value for a plane. One of the branches of the if chain is:
"Else If ('plane buffer allocation' is known and (plane buffer
allocation / plane blocks per line) >=1)"
Since our driver currently calculates DDB allocations first and the
actual watermark values second, the plane buffer allocation is known at
this point in our code and we include this test in our driver's logic.
However we plan to soon move to a "watermarks first, ddb allocation
second" sequence where we won't know the DDB allocation at this point.
Let's drop this arm of the if/else statement (effectively considering
the DDB allocation unknown) as an independent patch so that any
regressions can be more accurately bisected to either the different
watermark value (in this patch) or the new DDB allocation (in the next
patch).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-1-matthew.d.roper@intel.com
On SKL+ the plane WM/BUF_CFG registers are a proper part of each
plane's register set. That means accessing them will cancel any
pending plane update, and we would need a PLANE_SURF register write
to arm the wm/ddb change as well.
To avoid all the problems with that let's just move the wm/ddb
programming into the plane update/disable hooks. Now all plane
registers get written in one (hopefully atomic) operation.
To make that feasible we'll move the plane ddb tracking into
the crtc state. Watermarks were already tracked there.
v2: Rebase due to input CSC
v3: Split out a bunch of junk (Matt)
v4: Add skl_wm_add_affected_planes() to deal with
cursor special case and non-zero wm register reset value
v5: Drop the unrelated for_each_intel_plane_mask() fix (Matt)
Remove the redundant ddb memset() (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #v3
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181127165900.31298-1-ville.syrjala@linux.intel.com
I have a Thinkpad X220 Tablet in my hands that is losing vblank
interrupts whenever LP3 watermarks are used.
If I nudge the latency value written to the WM3 register just
by one in either direction the problem disappears. That to me
suggests that the punit will not enter the corrsponding
powersave mode (MPLL shutdown IIRC) unless the latency value
in the register matches exactly what we read from SSKPD. Ie.
it's not really a latency value but rather just a cookie
by which the punit can identify the desired power saving state.
On HSW/BDW this was changed such that we actually just write
the WM level number into those bits, which makes much more
sense given the observed behaviour.
We could try to handle this by disallowing LP3 watermarks
only when vblank interrupts are enabled but we'd first have
to prove that only vblank interrupts are affected, which
seems unlikely. Also we can't grab the wm mutex from the
vblank enable/disable hooks because those are called with
various spinlocks held. Thus we'd have to redesigne the
watermark locking. So to play it safe and keep the code
simple we simply disable LP3 watermarks on all SNB machines.
To do that we simply zero out the latency values for
watermark level 3, and we adjust the watermark computation
to check for that. The behaviour now matches that of the
g4x/vlv/skl wm code in the presence of a zeroed latency
value.
v2: s/USHRT_MAX/U32_MAX/ for consistency with the types (Chris)
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101269
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103713
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114173440.6730-1-ville.syrjala@linux.intel.com