RPS provides a feedback loop where we use the load during the previous
evaluation interval to decide whether to up or down clock the GPU
frequency. Our responsiveness is split into 3 regimes, a high and low
plateau with the intent to keep the gpu clocked high to cover occasional
stalls under high load, and low despite occasional glitches under steady
low load, and inbetween. However, we run into situations like kodi where
we want to stay at low power (video decoding is done efficiently
inside the fixed function HW and doesn't need high clocks even for high
bitrate streams), but just occasionally the pipeline is more complex
than a video decode and we need a smidgen of extra GPU power to present
on time. In the high power regime, we sample at sub frame intervals with
a bias to upclocking, and conversely at low power we sample over a few
frames worth to provide what we consider to be the right levels of
responsiveness respectively. At low power, we more or less expect to be
kicked out to high power at the start of a busy sequence by waitboosting.
Prior to commit e9af4ea2b9 ("drm/i915: Avoid waitboosting on the active
request") whenever we missed the frame or stalled, we would immediate go
full throttle and upclock the GPU to max. But in commit e9af4ea2b9, we
relaxed the waitboosting to only apply if the pipeline was deep to avoid
over-committing resources for a near miss. Sadly though, a near miss is
still a miss, and perceptible as jitter in the frame delivery.
To try and prevent the near miss before having to resort to boosting
after the fact, we use the pageflip queue as an indication that we are
in an "interactive" regime and so should sample the load more frequently
to provide power before the frame misses it vblank. This will make us
more favorable to providing a small power increase (one or two bins) as
required rather than going all the way to maximum and then having to
work back down again. (We still keep the waitboosting mechanism around
just in case a dramatic change in system load requires urgent uplocking,
faster than we can provide in a few evaluation intervals.)
v2: Reduce rps_set_interactive to a boolean parameter to avoid the
confusion of what if they wanted a new power mode after pinning to a
different mode (which to choose?)
v3: Only reprogram RPS while the GT is awake, it will be set when we
wake the GT, and while off warns about being used outside of rpm.
v4: Fix deferred application of interactive mode
v5: s/state/interactive/
v6: Group the mutex with its principle in a substruct
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107111
Fixes: e9af4ea2b9 ("drm/i915: Avoid waitboosting on the active request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180731132629.3381-1-chris@chris-wilson.co.uk
(cherry picked from commit 60548c554b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We're doing a pointless translation from hpd_pin to port simply for
passing the thing to long_pulse_detect(). Let's pass the hpd_pin
directly instead.
This removes the assumption that the hpd_pin and port always
match. The only other place where we make that assumption anymore
is intel_hpd_pin_default() and that's fine as it's what determines
the relationship between the two. If we ever get hardware where
the hpd pins are wired in more interesting ways it should be
trivial to handle from now on.
This should also fix the IS_CNL_WITH_PORT_F() case as that mapped
pin E back to port F and passed that to
spt_port_hotplug2_long_detect() which would always return false
for port F. Now that we pass in pin E directly it'll actually
do the right thing.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: cf53902f48 ("drm/i915/cnl: Add HPD support for Port F.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-7-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Adjust the EIR clearing to cope with the edge triggered IIR
on i965/g4x. To guarantee an edge in the ISR master error bit
we temporarily mask everything in EMR. As some of the EIR bits
can't even be directly cleared we also borrow a trick from
i915_clear_error_registers() and permanently mask any bit that
remains high. No real thought given to how we might unmask them
again once the cause for the error has been clered. I suppose
on pre-g4x GPU reset will reinitialize EMR from scratch.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611200258.27121-3-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Just like with PIPESTAT, the edge triggered IIR on i965/g4x
also causes problems for hotplug interrupts. To make sure
we don't get the IIR port interrupt bit stuck low with the
ISR bit high we must force an edge in ISR. Unfortunately
we can't borrow the PIPESTAT trick and toggle the enable
bits in PORT_HOTPLUG_EN as that act itself generates hotplug
interrupts. Instead we just have to loop until we've cleared
PORT_HOTPLUG_STAT, or we just give up and WARN.
v2: Don't frob with PORT_HOTPLUG_EN
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180614175625.1615-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Now that we use the CSB stored in the CPU friendly HWSP, we do not need
to track interrupts for when the mmio CSB registers are valid and can
just check where we read up to last from the cached HWSP. This means we
can forgo the atomic bit tracking from interrupt, and in the next patch
it means we can check the CSB at any time.
v2: Change the splitting inside reset_prepare, we only want to lose
testing the interrupt in this patch, the next patch requires the change
in locking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-8-chris@chris-wilson.co.uk
In the next patch, we will process the CSB events directly from the
submission path, rather than only after a CS interrupt. Hence, we will
no longer have the need for a loop until the has-interrupt bit is clear,
and in the meantime can remove that small optimisation.
v2: Tvrtko pointed out it was safer to unconditionally kick the tasklet
after each irq, when assuming that the tasklet is called for each irq.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628201211.13837-4-chris@chris-wilson.co.uk
This patch addresses Interrupts from south display engine (SDE).
ICP has two registers - SHOTPLUG_CTL_DDI and SHOTPLUG_CTL_TC.
Introduce these registers and their intended values.
Introduce icp_irq_handler().
The icp_irq_postinstall() takes care of
enabling all PCH interrupt sources, to unmask
them as needed with SDEIMR, as is done
done by ibx_irq_pre_postinstall() for earlier platforms.
We do not need to explicitly call the ibx_irq_pre_postinstall().
Also, while changing these,
s/CPT/PPT/CPT-CNP comment.
v2:
- remove redundant register defines.(Lucas)
- Change register names to be more consistent with
previous platforms (Lucas)
v3:
-Reorder bit defines to a more appropriate location.
Change the comments. Confirm in the commit message that
icp_irq_postinstall() need not go to
ibx_irq_pre_postinstall() and ibx_irq_postinstall()
as in earlier platforms. (Paulo)
Cc: Lucas De Marchi <lucas.de.marchi@gmail.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
[Paulo: coding style bikesheds and rebases].
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530046343-30649-1-git-send-email-anusha.srivatsa@intel.com
On i965/g4x IIR is edge triggered. So in order for IIR to notice that
there is still a pending interrupt we have to force and edge in ISR.
For the ISR/IIR pipe event bits we can do that by temporarily
clearing all the PIPESTAT enable bits when we ack the status bits.
This will force the ISR pipe event bit low, and it can then go back
high when we restore the PIPESTAT enable bits.
This avoids the following race:
1. stat = read(PIPESTAT)
2. an enabled PIPESTAT status bit goes high
3. write(PIPESTAT, enable|stat);
4. write(IIR, PIPE_EVENT)
The end result is IIR==0 and ISR!=0. This can lead to nasty
vblank wait/flip_done timeouts if another interrupt source
doesn't trick us into looking at the PIPESTAT status bits despite
the IIR PIPE_EVENT bit being low.
Before i965 IIR was level triggered so this problem can't actually
happen there. And curiously VLV/CHV went back to the level triggered
scheme as well. But for simplicity we'll use the same i965/g4x
compatible code for all platforms.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106033
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105225
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106030
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611200258.27121-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Interrupts other than the one for AUX errors are required only for debug,
so unmask them via debugfs when the user requests debug.
User can make such a request with
echo 1 > <DEBUG_FS>/dri/0/i915_edp_psr_debug
There are no locks to serialize PSR debug enabling from
irq_postinstall() and debugfs for simplicity. As irq_postinstall() is
called only during module initialization/resume and IGT subtests
aren't expected to modify PSR debug at those times, we should be safe.
v2: Unroll loops (Ville)
Avoid resetting error mask bits.
v3: Unmask interrupts in postinstall() if debug was still enabled.
Avoid RMW (Ville)
v4: Avoid extra IMR write introduced in the previous version.(Jose)
Style changes, renames (Jose).
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Jose Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180405013717.24254-1-dhinakaran.pandiyan@intel.com
Currently, we rely on inspecting the hangcheck state from within the
i915_reset() routines to determine which engines were guilty of the
hang. This is problematic for cases where we want to run
i915_handle_error() and call i915_reset() independently of hangcheck.
Instead of relying on the indirect parameter passing, turn it into an
explicit parameter providing the set of stalled engines which then are
treated as guilty until proven innocent.
While we are removing the implicit stalled parameter, also make the
reason into an explicit parameter to i915_reset(). We still need a
back-channel for i915_handle_error() to hand over the task to the locked
waiter, but let's keep that its own channel rather than incriminate
another.
This leaves stalled/seqno as being private to hangcheck, with no more
nefarious snooping by reset, be it whole-device or per-engine. \o/
The only real issue now is that this makes it crystal clear that we
don't actually do any testing of hangcheck per se in
drv_selftest/live_hangcheck, merely of resets!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-2-chris@chris-wilson.co.uk
BSpec says:
"Second level interrupt events are stored in the GT INT DW. GT INT DW is
a double buffered structure. A snapshot of events is taken when SW reads
GT INT DW. From the time of read to the time of SW completely clearing
GT INT DW (to indicate end of service), all incoming interrupts are logged
in a secondary storage structure. this guarantees that the record of
interrupts SW is servicing will not change while under service".
We read GT INT DW in several places now:
- The IRQ handler (banks 0 and 1) where, hopefully, it is completely
cleared (operation now covered with the irq lock).
- The 'reset' interrupts functions for RPS and GuC logs, where we clear
the bit we are interested in and leave the others for the normal
interrupt handler.
- The 'enable' interrupts functions for RPS and GuC logs, as a measure
of precaution. Here we could relax a bit and don't check GT INT DW
at all or, if we do, at least we should clear the offending bit
(which is what this patch does).
Note that, if every bit is cleared on reading GT INT DW, the register
won't be locked. Also note that, according to the BSpec, GT INT DW
cannot be cleared without first servicing the Selector & Shared IIR
registers.
v2:
- Remove some code duplication (Tvrtko)
- Make sure GT_INTR_DW are protected by the irq spinlock, since it's a
global resource (Tvrtko)
v3: Optimize the spinlock (Tvrtko)
v4: Rebase.
v5:
- take spinlock on outer scope to please sparse (Mika)
- use raw_reg accessors (Mika)
v6: omit the continue in looping banks (Michel)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v4)
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406093237.14548-1-mika.kuoppala@linux.intel.com
Not all callers want the GPU error to handled in the same way, so expose
a control parameter. In the first instance, some callers do not want the
heavyweight error capture so add a bit to request the state to be
captured and saved.
v2: Pass msg down to i915_reset/i915_reset_engine so that we include the
reason for the reset in the dev_notice(), superseding the earlier option
to not print that notice.
v3: Stash the reason inside the i915->gpu_error to handover to the direct
reset from the blocking waiter.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180320100449.1360-2-chris@chris-wilson.co.uk
Originally we were inlining gen8_cs_irq_handler() and so expected the
compiler to constant-fold away the irq_shift (so we had hardcoded it as
opposed to use engine->irq_shift). However, we dropped the inline given
the proliferation of gen8_cs_irq_handler()s. If we pull the shifting
of the iir into the caller, we can shrink the code still further:
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-34 (-34)
Function old new delta
gen8_cs_irq_handler 123 118 -5
gen8_gt_irq_handler 261 248 -13
gen11_irq_handler 722 706 -16
v2: Drop gen11_cs_irq_handler now that it is a simple
stub around gen8_cs_irq_handler (Daniele)
References: 5d3d69d5c1 ("drm/i915: Stop inlining the execlists IRQ handler")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309010808.11921-1-chris@chris-wilson.co.uk
v2: Rebase.
v3:
* Remove DPF, it has been removed from SKL+.
* Fix -internal rebase wrt. execlists interrupt handling.
v4: Rebase.
v5:
* Updated for POR changes. (Daniele Ceraolo Spurio)
* Merged with irq handling fixes by Daniele Ceraolo Spurio:
* Simplify the code by using gen8_cs_irq_handler.
* Fix interrupt handling for the upstream kernel.
v6:
* Remove early bringup debug messages (Tvrtko)
* Add NB about arbitrary spin wait timeout (Tvrtko)
v7 (from Paulo):
* Don't try to write RO bits to registers.
* Don't check for PCH types that don't exist. PCH interrupts are not
here yet.
v9:
* squashed in selector and shared register handling (Daniele)
* skip writing of irq if data is not valid (Daniele)
* use time_after32 (Chris)
* use I915_MAX_VCS and I915_MAX_VECS (Daniele)
* remove fake pm interrupt handling for later patch (Mika)
v10:
* Direct processing of banks. clear banks early (Chris)
* remove poll on valid bit, only clear valid bit (Mika)
* use raw accessors, better naming (Chris)
v11:
* adapt to raw_reg_[read|write]
* bring back polling the valid bit (Daniele)
v12:
* continue if unset intr_dw (Daniele)
* comment the usage of gen8_de_irq_handler bits (Daniele)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228101153.7224-2-mika.kuoppala@linux.intel.com
The compiler is not automatically caching the i915->regs address inside
a register and emitting a load for every mmio access. For simple
functions like gen8_gt_irq_handler that are already using the raw
accessors, we can open-code them for substantial savings:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83)
Function old new delta
gen8_gt_irq_handler 290 266 -24
gen8_gt_irq_ack 181 122 -59
Total: Before=954637, After=954554, chg -0.01%
v2: Add raw_reg_read/raw_reg_write.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180219100926.16554-1-chris@chris-wilson.co.uk
The frame counter may have got reset between disabling and enabling vblank
interrupts due to DMC putting the hardware to DC5/6 states if PSR was
active. The frame counter could also have stalled if PSR was active in case
there was no DMC. The frame counter resetting has a user visible impact
of screen freezes.
Make use of drm_vblank_restore() to compute missed vblanks for the duration
in which vblank interrupts were disabled and update the vblank counter with
this value as diff. There's no need to check if PSR was actually active in
the interrupt disabled duration, so simplify the check to a feature check.
Enabling vblank interrupts wakes up the hardware from DC5/6 and prevents it
from going back again as long as the there are pending interrupts. So, we
don't have to explicity disallow DC5/6 after enabling vblank interrupts to
keep the counter running.
This change is not applicable to CHV, as enabling interrupts does not
prevent the hardware from activating PSR.
v2: Added comments(Rodrigo) and rewrote commit message.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180203051302.9974-10-dhinakaran.pandiyan@intel.com
On some Cannonlake SKUs we have a dedicated Aux for port F,
that is only the full split between port A and port E.
There is still no Aux E for Port E, as in previous platforms,
because port_E still means shared lanes with port A.
v2: Rebase.
v3: Add couple missed PORT_F cases on intel_dp.
v4: Rebase and fix commit message.
v5: Squash Imre's "drm/i915: Add missing AUX_F power well string"
v6: Rebase on top of display headers rework.
v7: s/IS_CANNONLAKE/IS_CNL_WITH_PORT_F (DK)
v8: Fix Aux bits for Port F (DK)
v9: Fix VBT definition of Port F (DK).
v10: Squash power well addition to this patch to avoid
warns as pointed by DK.
v11: Clean up squashed commit message. (David)
v12: Remove unnecessary handling for older platforms (DK)
Adding AUX_F to PG2 following other existent ones. (DK)
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180129232223.766-2-rodrigo.vivi@intel.com